tltorch.factorized_layers.FactorizedLinear

class tltorch.factorized_layers.FactorizedLinear(in_tensorized_features, out_tensorized_features, bias=True, factorization='cp', rank='same', implementation='factorized', n_layers=1, checkpointing=False, device=None, dtype=None)[source]

Tensorized Fully-Connected Layers

The weight matrix is tensorized to a tensor of size (*in_tensorized_features, *out_tensorized_features). That tensor is expressed as a low-rank factorized tensor.

During inference with the 'reconstructed' implementation, the full tensor is reconstructed and unfolded back into a matrix, which is then used for the forward pass of a regular linear layer.
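The tensorized shapes must factor the original feature dimensions exactly. A minimal sketch of that shape constraint, using hypothetical feature sizes (8 and 12):

```python
from math import prod

# Hypothetical example: tensorizing an 8 -> 12 linear layer
in_features, out_features = 8, 12
in_tensorized_features = (2, 2, 2)   # prod = 8 == in_features
out_tensorized_features = (3, 4)     # prod = 12 == out_features

assert prod(in_tensorized_features) == in_features
assert prod(out_tensorized_features) == out_features

# The weight matrix becomes a tensor of this shape,
# which is then expressed in low-rank (e.g. CP) form:
weight_tensor_shape = (*in_tensorized_features, *out_tensorized_features)
print(weight_tensor_shape)  # (2, 2, 2, 3, 4)
```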

Parameters:
in_tensorized_features : int tuple

shape to which the in_features dimension is tensorized, e.g. if in_features is 8, in_tensorized_features could be (2, 2, 2); must verify prod(in_tensorized_features) == in_features

out_tensorized_features : int tuple

shape to which the out_features dimension is tensorized; must verify prod(out_tensorized_features) == out_features

factorization : str, default is 'cp'
rank : int tuple or str
implementation : {'factorized', 'reconstructed'}, default is 'factorized'

which implementation to use for the forward function:
- if 'factorized', the input is contracted directly with the factors of the decomposition
- if 'reconstructed', the full weight matrix is reconstructed from the factorized version and used for a regular linear layer forward pass

n_layers : int, default is 1

number of linear layers to be parametrized with a single factorized tensor

bias : bool, default is True
checkpointing : bool, default is False

whether to enable gradient checkpointing to save memory during the training-mode forward pass

device : PyTorch device, default is None
dtype : PyTorch dtype, default is None

Methods

forward(x[, indices])

Defines the computation performed at every call.

from_linear(linear, in_tensorized_features, ...)

Class method to create an instance from an existing linear layer

from_linear_list(linear_list, ...[, bias, ...])

Class method to create an instance from a list of existing linear layers

get_linear

reset_parameters

forward(x, indices=0)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this function, since the former takes care of running the registered hooks while the latter silently ignores them.

classmethod from_linear(linear, in_tensorized_features, out_tensorized_features, rank, bias=True, factorization='CP', implementation='reconstructed', checkpointing=False, decomposition_kwargs={})[source]

Class method to create an instance from an existing linear layer

Parameters:
linear : torch.nn.Linear

layer to tensorize

in_tensorized_features, out_tensorized_features : int tuple

shapes to which the in_features and out_features dimensions of the weight matrix are tensorized; must verify np.prod(in_tensorized_features) * np.prod(out_tensorized_features) == np.prod(linear.weight.shape)

factorization : str, default is 'CP'
implementation : str

which implementation to use for the forward function; supports 'factorized' and 'reconstructed', default is 'reconstructed'

checkpointing : bool

whether to enable gradient checkpointing to save memory during the training-mode forward pass, default is False

rank : {rank of the decomposition, 'same', float}

if float, fraction of the parameters of the original weights to use; if 'same', use the same number of parameters

bias : bool, default is True
classmethod from_linear_list(linear_list, in_tensorized_features, out_tensorized_features, rank, bias=True, factorization='CP', implementation='reconstructed', checkpointing=False, decomposition_kwargs={'init': 'random'})[source]

Class method to create an instance from a list of existing linear layers, jointly parametrized with a single factorized tensor

Parameters:
linear_list : list of torch.nn.Linear

layers to tensorize jointly

in_tensorized_features, out_tensorized_features : int tuple

shapes to which the in_features and out_features dimensions of each weight matrix are tensorized; must verify np.prod(in_tensorized_features) * np.prod(out_tensorized_features) == np.prod(linear.weight.shape) for each linear in linear_list

factorization : str, default is 'CP'
implementation : str

which implementation to use for the forward function; supports 'factorized' and 'reconstructed', default is 'reconstructed'

checkpointing : bool

whether to enable gradient checkpointing to save memory during the training-mode forward pass, default is False

rank : {rank of the decomposition, 'same', float}

if float, fraction of the parameters of the original weights to use; if 'same', use the same number of parameters

bias : bool, default is True