{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n# Constrained CP decomposition in Tensorly >=0.7\nOn this page, you will find examples showing how to use constrained CP/Parafac.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Introduction\nSince version 0.7, Tensorly includes constrained CP decomposition which penalizes or\nconstrains factors as chosen by the user. The proposed implementation of constrained CP uses the \nAlternating Optimization Alternating Direction Method of Multipliers (AO-ADMM) algorithm from [1] which\nsolves alternatively convex optimization problem using primal-dual optimization. In constrained CP\ndecomposition, an auxilliary factor is introduced which is constrained or regularized using an operator called the \nproximal operator. The proximal operator may therefore change according to the selected constraint or penalization.\n\nTensorly provides several constraints and their corresponding proximal operators, each can apply to one or all factors in the CP decomposition:\n\n1. Non-negativity\n * `non_negative` in signature\n * Prevents negative values in CP factors.\n2. L1 regularization\n * `l1_reg` in signature\n * Adds a L1 regularization term on the CP factors to the CP cost function, this promotes sparsity in the CP factors. The user chooses the regularization amount.\n3. L2 regularization\n * `l2_reg` in signature\n * Adds a L2 regularization term on the CP factors to the CP cost function. The user chooses the regularization amount.\n4. L2 square regularization\n * `l2_square_reg` in signature\n * Adds a L2 regularization term on the CP factors to the CP cost function. The user chooses the regularization amount.\n5. Unimodality\n * `unimodality` in signature\n * This constraint acts columnwise on the factors\n * Impose that each column of the factors is unimodal (there is only one local maximum, like a Gaussian).\n6. Simplex\n * `simplex` in signature\n * This constraint acts columnwise on the factors\n * Impose that each column of the factors lives on the simplex or user-defined radius (entries are nonnegative and sum to a user-defined positive parameter columnwise).\n7. Normalization\n * `normalize` in signature\n * Impose that the largest absolute value in the factors elementwise is 1.\n8. Normalized sparsity\n * `normalized_sparsity` in signature\n * This constraint acts columnwise on the factors\n * Impose that the columns of factors are both normalized with the L2 norm, and k-sparse (at most k-nonzeros per column) with k user-defined.\n9. Soft sparsity\n * `soft_sparsity` in signature\n * This constraint acts columnwise on the factors\n * Impose that the columns of factors have L1 norm bounded by a user-defined threshold.\n10. Smoothness\n * `smoothness` in signature\n * This constraint acts columnwise on the factors\n * Favor smoothness in factors columns by penalizing the L2 norm of finite differences. The user chooses the regularization amount. The proximal operator in fact solves a banded system.\n11. Monotonicity\n * `monotonicity` in signature\n * This constraint acts columnwise on the factors\n * Impose that the factors are either always increasing or decreasing (user-specified) columnwise. This is based on isotonic regression.\n12. Hard sparsity\n * `hard_sparsity` in signature\n * This constraint acts columnwise on the factors\n * Impose that each column of the factors has at most k nonzero entries (k is user-defined).\n\nWhile some of these constraints (2, 3, 4, 6, 8, 9, 12) require a scalar\ninput as its parameter or regularizer, boolean input could be enough\nfor other constraints (1, 5, 7, 10, 11). Selection of one of these\nconstraints for all mode (or factors) or using different constraints for different modes are both supported.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import numpy as np\nimport tensorly as tl\nfrom tensorly.decomposition import constrained_parafac\nimport matplotlib.pyplot as plt\n\nnp.set_printoptions(precision=2)\n\n# tensor generation\ntensor = tl.tensor(np.random.rand(6, 8, 10))\nrank = 3"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Using one constraint for all modes\nConstraints are inputs of the constrained_parafac function, which itself uses the\n``tensorly.tenalg.proximal.validate_constraints`` function in order to process the input\nof the user. If a user wants to use the same constraint for all modes, an\ninput (bool or a scalar value or list of scalar values) should be given to this constraint.\nAssume, one wants to use unimodality constraint for all modes. Since it does not require\nany scalar input, unimodality can be imposed by writing `True` for `unimodality`:\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"_, factors = constrained_parafac(tensor, rank=rank, unimodality=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This constraint imposes that each column of all the factors in the CP decomposition are unimodal:\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"fig = plt.figure()\nfor i in range(rank):\n plt.plot(factors[0][:, i])\n plt.legend(['1. column', '2. column', '3. column'], loc='upper left')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Constraints requiring a scalar input can be used similarly as follows:\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"_, factors = constrained_parafac(tensor, rank=rank, l1_reg=0.05)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same regularization coefficient l1_reg is used for all the modes. Here the l1 penalization induces sparsity given that the regularization coefficient is large enough.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"fig = plt.figure()\nplt.title('Histogram of 1. factor')\n_, _, _ = plt.hist(factors[0].flatten())\n\nfig = plt.figure()\nplt.title('Histogram of 2. factor')\n_, _, _ = plt.hist(factors[1].flatten())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Using one constraint for some modes\nAs a second option, constraint can be used for only a few selected modes by using\na python dictionary:\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"_, factors = constrained_parafac(tensor, rank=rank, non_negative={0: True, 2: True})\nprint(\"1. factor\\n\", factors[0])\nprint(\"2. factor\\n\", factors[1])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since only the first and last factors are chosen, entries on the second mode factor could be negative.\n\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Using a constraint with the different scalar inputs for each mode\nOne may prefer different scalar value for each mode. It is possible by\nusing a list structure:\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"_, factors = constrained_parafac(tensor, rank=rank, l1_reg=[0.01, 0.02, 0.03])\n\nfig = plt.figure()\nplt.title('Histogram of 1. factor')\n_, _, _ = plt.hist(factors[0].flatten())\n\nfig = plt.figure()\nplt.title('Histogram of 2. factor')\n_, _, _ = plt.hist(factors[1].flatten())\n\nfig = plt.figure()\nplt.title('Histogram of 3. factor')\n_, _, _ = plt.hist(factors[2].flatten())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Using different constraints for each mode\nTo use different constraint for different modes, the dictionary structure\nshould be preferred:\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"_, factors = constrained_parafac(tensor, rank=rank, non_negative={1:True}, l1_reg={0: 0.01},\n l2_square_reg={2: 0.01})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the dictionary, `key` is the selected mode and `value` is a scalar value or\nonly `True` depending on the selected constraint.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"print(\"1. factor\\n\", factors[0])\nprint(\"2. factor\\n\", factors[1])\nprint(\"3. factor\\n\", factors[2])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Thus, first factor will be non-negative, second factor will be regularized\nby $0.01$ with $l_1$ and last factor will be regularized by\n$0.01$ with $l_2^2$.\n\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## References\n\n[1] Huang, Kejun, Nicholas D. Sidiropoulos, and Athanasios P. Liavas.\n\"A flexible and efficient algorithmic framework for constrained\nmatrix and tensor factorization.\"\nIEEE Transactions on Signal Processing 64.19 (2016): 5052-5065.\n[(Online version)](https://ieeexplore.ieee.org/document/7484753)\n\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
}
},
"nbformat": 4,
"nbformat_minor": 0
}