tltorch.factorized_layers.FactorizedLinear
- class tltorch.factorized_layers.FactorizedLinear(in_tensorized_features, out_tensorized_features, bias=True, factorization='cp', rank='same', implementation='factorized', n_layers=1, checkpointing=False, device=None, dtype=None)
Tensorized Fully-Connected Layers
The weight matrix is tensorized into a tensor of size (*in_tensorized_features, *out_tensorized_features). That tensor is expressed as a low-rank tensor.
With the ‘reconstructed’ implementation, the full tensor is reconstructed, unfolded back into a matrix, and used for the forward pass of a regular linear layer. With the ‘factorized’ implementation, the input is contracted directly with the factors of the decomposition.
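The reconstruction path described above can be illustrated with a minimal NumPy sketch. It is not the library's implementation: the CP rank, the random factor matrices, and the (in_features, out_features) matrix layout are assumptions chosen for illustration, and the bias is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

in_tensorized_features = (4, 4)    # in_features  = 4 * 4 = 16
out_tensorized_features = (4, 4)   # out_features = 4 * 4 = 16
rank = 3                           # illustrative CP rank (assumption)

# One factor matrix per mode of the tensorized weight, each of shape (dim, rank).
dims = in_tensorized_features + out_tensorized_features
factors = [rng.standard_normal((d, rank)) for d in dims]

# Reconstruct the full tensor from its CP factors:
# T[i, j, o, p] = sum_r A[i, r] * B[j, r] * C[o, r] * D[p, r]
full_tensor = np.einsum("ir,jr,or,pr->ijop", *factors)

# Unfold the tensor back into an (in_features, out_features) matrix.
in_features = int(np.prod(in_tensorized_features))
out_features = int(np.prod(out_tensorized_features))
weight = full_tensor.reshape(in_features, out_features)

# Forward pass of a regular linear layer (no bias) on a batch of 5 inputs.
x = rng.standard_normal((5, in_features))
y = x @ weight

# Parameter counts: dense matrix versus CP factor matrices.
dense_params = in_features * out_features
cp_params = rank * sum(dims)
```

In this toy setting the factor matrices hold 48 values against 256 for the dense matrix, which is where the memory saving of a low-rank parametrization comes from.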
- Parameters:
- in_tensorized_features : int tuple
shape to which the input_features dimension is tensorized, e.g. if in_features is 8, in_tensorized_features could be (2, 2, 2). Must satisfy prod(in_tensorized_features) == in_features
- out_tensorized_features : int tuple
shape to which the output_features dimension is tensorized.
- factorization : str, default is ‘cp’
- rank : int tuple or str
- implementation : {‘factorized’, ‘reconstructed’}, default is ‘factorized’
which implementation to use for the forward function. If ‘factorized’, the input is contracted directly with the factors of the decomposition. If ‘reconstructed’, the full weight matrix is reconstructed from the factorized version and used for a regular linear layer forward pass.
- n_layers : int, default is 1
number of linear layers to be parametrized with a single factorized tensor
- bias : bool, default is True
- checkpointing : bool, default is False
whether to enable gradient checkpointing to save memory during the training-mode forward pass
- device : PyTorch device to use, default is None
- dtype : PyTorch dtype, default is None
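The two values of implementation compute the same output in different ways. The NumPy sketch below checks this for a CP-factorized weight. The shapes, rank, and factor names (A, B, C, D) are assumptions made for illustration, and the code does not reproduce the library's internals.

```python
import numpy as np

rng = np.random.default_rng(0)
in_shape, out_shape, rank = (2, 3), (4, 5), 3   # in_features = 6, out_features = 20

# CP factor matrices, one per tensorized mode (illustrative random values).
A, B = (rng.standard_normal((d, rank)) for d in in_shape)
C, D = (rng.standard_normal((d, rank)) for d in out_shape)

x = rng.standard_normal((7, 6))  # batch of 7 inputs

# 'reconstructed': rebuild the full weight, unfold it, then do a regular matmul.
weight = np.einsum("ir,jr,or,pr->ijop", A, B, C, D).reshape(6, 20)
y_reconstructed = x @ weight

# 'factorized': tensorize the input and contract it directly with the factors,
# without ever forming the full weight.
x_tensorized = x.reshape(7, *in_shape)
y_factorized = np.einsum("bij,ir,jr,or,pr->bop", x_tensorized, A, B, C, D).reshape(7, 20)

same_output = np.allclose(y_factorized, y_reconstructed)
```

Both paths agree up to floating-point error. Which one is faster or lighter on memory depends on the shapes and the rank.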
Methods
forward(x[, indices]): Defines the computation performed at every call.
from_linear(linear, in_tensorized_features, ...): Class method to create an instance from an existing linear layer
from_linear_list(linear_list, ...[, bias, ...]): Class method to create an instance from a list of existing linear layers
get_linear
reset_parameters
- forward(x, indices=0)
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- classmethod from_linear(linear, in_tensorized_features, out_tensorized_features, rank, bias=True, factorization='CP', implementation='reconstructed', checkpointing=False, decomposition_kwargs={})
Class method to create an instance from an existing linear layer
- Parameters:
- linear : torch.nn.Linear
layer to tensorize
- in_tensorized_features, out_tensorized_features : tuple
shapes to tensorize the input and output dimensions of the weight matrix to. Must satisfy np.prod(in_tensorized_features) * np.prod(out_tensorized_features) == np.prod(linear.weight.shape)
- factorization : str, default is ‘CP’
- implementation : str, default is ‘reconstructed’
which implementation to use for the forward function. Supports ‘factorized’ and ‘reconstructed’.
- checkpointing : bool, default is False
whether to enable gradient checkpointing to save memory during the training-mode forward pass
- rank : {rank of the decomposition, ‘same’, float}
if float, fraction of the parameters of the original weights to use; if ‘same’, use the same number of parameters
- bias : bool, default is True
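To make the float and ‘same’ options for rank concrete, here is a rough sketch of how a parameter budget can be turned into a CP rank. The helper cp_rank_for_fraction is hypothetical and written for this example. It counts only the factor matrices and rounds to the nearest integer, which may differ from the rule the library actually applies.

```python
import math

def cp_rank_for_fraction(tensor_shape, fraction=1.0):
    """Hypothetical helper: CP rank whose factor matrices hold roughly
    `fraction` of the parameters of the dense tensor of shape `tensor_shape`."""
    dense_params = math.prod(tensor_shape)   # parameters of the full weight
    params_per_rank = sum(tensor_shape)      # each rank-1 component costs sum(dims)
    return max(1, round(fraction * dense_params / params_per_rank))

# Weight of a 64 -> 64 linear layer, tensorized as (4, 4, 4) x (4, 4, 4).
tensor_shape = (4, 4, 4, 4, 4, 4)

rank_same = cp_rank_for_fraction(tensor_shape, 1.0)    # analogous to rank='same'
rank_tenth = cp_rank_for_fraction(tensor_shape, 0.1)   # analogous to rank=0.1
```

Here the dense weight has 4096 parameters and each rank-1 component costs 24, so the two budgets give ranks of 171 and 17 respectively.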
- classmethod from_linear_list(linear_list, in_tensorized_features, out_tensorized_features, rank, bias=True, factorization='CP', implementation='reconstructed', checkpointing=False, decomposition_kwargs={'init': 'random'})
Class method to create an instance from a list of existing linear layers
- Parameters:
- linear_list : list of torch.nn.Linear
layers to tensorize
- in_tensorized_features, out_tensorized_features : tuple
shapes to tensorize the input and output dimensions of each weight matrix to. Must satisfy np.prod(in_tensorized_features) * np.prod(out_tensorized_features) == np.prod(linear.weight.shape) for each layer in linear_list
- factorization : str, default is ‘CP’
- implementation : str, default is ‘reconstructed’
which implementation to use for the forward function. Supports ‘factorized’ and ‘reconstructed’.
- checkpointing : bool, default is False
whether to enable gradient checkpointing to save memory during the training-mode forward pass
- rank : {rank of the decomposition, ‘same’, float}
if float, fraction of the parameters of the original weights to use; if ‘same’, use the same number of parameters
- bias : bool, default is True
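The idea of parametrizing several linear layers with a single tensor, and selecting one of them with indices in forward, can be sketched as follows. This is an illustration under stated assumptions: the weights are stacked along a leading mode of size n_layers, the (out_features, in_features) layout of torch.nn.Linear is mimicked with NumPy arrays, and the decomposition of the joint tensor is left out.

```python
import numpy as np

rng = np.random.default_rng(0)
n_layers, in_features, out_features = 3, 6, 4
in_shape, out_shape = (2, 3), (2, 2)

# Stand-ins for the weights of three linear layers, each of shape (out, in).
weights = [rng.standard_normal((out_features, in_features)) for _ in range(n_layers)]

# Stack and tensorize them into one joint tensor; this is the tensor
# that a single low-rank factorization would then approximate.
stacked = np.stack(weights).reshape(n_layers, *out_shape, *in_shape)

def forward(x, indices=0):
    """Apply the layer selected by `indices` (bias omitted)."""
    w = stacked[indices].reshape(out_features, in_features)
    return x @ w.T

x = rng.standard_normal((5, in_features))
y_second_layer = forward(x, indices=1)
```

Sharing one factorization across layers lets the low-rank structure exploit redundancy between them, at the cost of requiring that all layers have the same shape.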