Tensor Decompositions


Tensors are multidimensional generalizations of vectors and matrices represented by arrays with $n$ indices, i.e.,
$$\mathbf{T} \in \mathbb{C}^D = \mathbb{C}^{d_1 \times \dots \times d_n}$$
is a tensor of order $n$, where $D = (d_1, \dots, d_n)$ is called the mode set. In what follows, we denote tensors by bold letters and refer to an element of a tensor $\mathbf{T}$ using subscript indices. The storage consumption of such a tensor can be estimated as $\mathcal{O}(d^n)$, where $d$ is the maximum of all mode sizes. Storing a higher-order tensor is therefore in general infeasible for large $n$, since the number of elements grows exponentially with the order. This is also known as the curse of dimensionality. In many cases, however, this problem can be mitigated by exploiting low-rank tensor approximations.
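As a quick illustration of this exponential growth, the following minimal Python sketch (with arbitrarily chosen values for $d$ and $n$) estimates the memory a full order-$n$ tensor with complex entries would require:

```python
# Arbitrary example values for the mode size d and the order n.
d, n = 10, 15
num_entries = d ** n                  # the number of elements grows as d^n
memory_gb = num_entries * 16 / 1e9    # complex128 entries take 16 bytes each
print(f"d={d}, n={n}: {num_entries:.2e} entries, about {memory_gb:.2e} GB")
```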


The essential operation used for tensor decompositions is the so-called tensor product: Given a tensor $\mathbf{T}$ of order $n$ and a tensor $\mathbf{U}$ of order $m$, the tensor product is defined by
$$(\mathbf{T} \otimes \mathbf{U})_{x_1, \dots, x_n, y_1, \dots , y_m} = \mathbf{T}_{x_1, \dots, x_n} \mathbf{U}_{y_1, \dots, y_m},$$
for any possible combination of mode indices.
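In NumPy, this operation corresponds to the outer product. The following minimal sketch (with arbitrary shapes and random entries) computes the tensor product via `np.tensordot` with `axes=0` and spot-checks one entry against the defining formula:

```python
import numpy as np

# Tensor product of an order-2 tensor T and an order-3 tensor U:
# the result is an order-5 tensor with entries T[x1, x2] * U[y1, y2, y3].
T = np.random.rand(2, 3) + 1j * np.random.rand(2, 3)
U = np.random.rand(4, 2, 5) + 1j * np.random.rand(4, 2, 5)

TU = np.tensordot(T, U, axes=0)       # shape (2, 3, 4, 2, 5)

# Spot-check a single entry against the element-wise definition.
assert np.isclose(TU[1, 2, 3, 0, 4], T[1, 2] * U[3, 0, 4])
```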


In order to efficiently represent high-dimensional systems (e.g., probability distributions, transformed data tensors, or quantum states), we focus on tensor trains (TT), also known as matrix product states (MPS). The TT/MPS format is a special case of the more general hierarchical Tucker format and has turned out to be a promising candidate in terms of both storage consumption and computational robustness. A tensor $\mathbf{T}$ is said to be in the MPS/TT format if
$$\mathbf{T}  = \sum_{k_0=1}^{r_0} \dots \sum_{k_n=1}^{r_n} \bigotimes_{i=1}^n \mathbf{T}^{(i)}_{k_{i-1},:,k_i} = \sum_{k_0=1}^{r_0} \dots \sum_{k_n=1}^{r_n} \mathbf{T}^{(1)}_{k_0,:,k_1} \otimes \dots \otimes \mathbf{T}^{(n)}_{k_{n-1}, :, k_n}.$$
The variables $r_i$ are called bond dimensions or TT ranks, and it holds that $r_0 = r_n = 1$ and $r_i \geq 1$ for $i = 1, \dots, n-1$. The tensors $\mathbf{T}^{(i)} \in \mathbb{C}^{r_{i-1} \times d_i \times r_i}$ are called (TT) cores. Each element of the tensor $\mathbf{T}$ can be written as
$$\mathbf{T}_{x_1, \dots, x_n} = \mathbf{T}^{(1)}_{1,x_1,:} \mathbf{T}^{(2)}_{:,x_2,:} \cdots \mathbf{T}^{(n-1)}_{:,x_{n-1},:} \mathbf{T}^{(n)}_{:,x_n,1},$$
which explains the origin of the name matrix product states (MPS). If the ranks are small enough, we may reduce the storage consumption of an order-$n$ tensor significantly: Instead of an exponential dependence, the storage then depends only linearly on the order and can be estimated as $\mathcal{O}(r^2 d n)$, where $r$ is the maximum over all ranks. That is, if the underlying correlation structure admits such a low-rank decomposition, an enormous reduction in complexity can be achieved.
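The element-wise formula above translates directly into a chain of matrix products. The following sketch (with randomly chosen cores and illustrative mode sizes and ranks) evaluates a single entry of a tensor given in TT/MPS format and compares it to the explicitly reconstructed full tensor:

```python
import numpy as np

# Core i has shape (r_{i-1}, d_i, r_i) with r_0 = r_n = 1.
dims  = [4, 3, 5, 2]      # mode sizes d_1, ..., d_n
ranks = [1, 2, 3, 2, 1]   # TT ranks r_0, ..., r_n
cores = [np.random.rand(ranks[i], dims[i], ranks[i + 1]) for i in range(len(dims))]

def tt_entry(cores, x):
    """Return T[x_1, ..., x_n] by multiplying the matrices T^(i)[:, x_i, :]."""
    result = cores[0][:, x[0], :]            # shape (1, r_1)
    for core, xi in zip(cores[1:], x[1:]):
        result = result @ core[:, xi, :]     # shape (1, r_i)
    return result[0, 0]                      # final shape is (1, 1)

# Compare with the explicitly reconstructed full tensor.
full = cores[0]
for core in cores[1:]:
    full = np.tensordot(full, core, axes=([-1], [0]))  # contract over the rank index
full = full.reshape(dims)

x = (2, 1, 4, 0)
assert np.isclose(tt_entry(cores, x), full[x])
```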


A linear operator $\mathbf{G} \in \mathbb{C}^{D \times D}$ in the TT operator format, also known as the matrix product operator (MPO) format, can be written as
$$ \mathbf{G} = \sum_{k_0=1}^{R_0} \dots \sum_{k_n=1}^{R_n} \bigotimes_{i=1}^n \mathbf{G}^{(i)}_{k_{i-1},:,:,k_i} = \sum_{k_0=1}^{R_0} \dots \sum_{k_n=1}^{R_n} \mathbf{G}^{(1)}_{k_0,:,:,k_1} \otimes \dots \otimes \mathbf{G}^{(n)}_{k_{n-1}, :, :, k_n}.$$
Here, the cores are tensors of order $4$. Figure 1 (a) and (b) show the graphical representation of an MPS $\mathbf{T} \in \mathbb{C}^{D}$ and an MPO $\mathbf{G} \in \mathbb{C}^{D \times D}$ with $D=(d_1, \dots, d_5)$, respectively.
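Analogously, an entry of an operator in MPO format is obtained by fixing both a row index $x_i$ and a column index $y_i$ in every order-4 core and multiplying the resulting matrices. The sketch below uses random cores and illustrative dimensions:

```python
import numpy as np

# Core i has shape (R_{i-1}, d_i, d_i, R_i) with R_0 = R_n = 1.
dims  = [3, 2, 4]         # physical dimensions d_1, ..., d_n
ranks = [1, 2, 3, 1]      # operator ranks R_0, ..., R_n
cores = [np.random.rand(ranks[i], dims[i], dims[i], ranks[i + 1])
         for i in range(len(dims))]

def mpo_entry(cores, x, y):
    """Return G[x_1, ..., x_n; y_1, ..., y_n] from the order-4 MPO cores."""
    result = cores[0][:, x[0], y[0], :]          # shape (1, R_1)
    for core, xi, yi in zip(cores[1:], x[1:], y[1:]):
        result = result @ core[:, xi, yi, :]
    return result[0, 0]

print(mpo_entry(cores, (2, 0, 3), (1, 1, 2)))
```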



Figure 1: Graphical representation of the MPS/MPO format: a core is depicted by a circle with arms indicating the modes of the tensor and the rank indices. (a) Tensor of order 5 as MPS with ranks $r_1, r_2, r_3, r_4$. The first and the last core are matrices; the other cores are tensors of order 3. (b) MPO of order 10 with ranks $R_1, R_2, R_3, R_4$. The first and the last core are tensors of order 3; the other cores are tensors of order 4.


Tensor trains have become a widely studied concept and have found their way into various scientific fields such as quantum mechanics, dynamical systems, system identification, and machine learning.