Computing Tucker via the HOSVD

Contents

Higher-order Singular Value Decomposition (HOSVD) and Sequentially-truncased HOSVD (ST-HOSVD)

The HOSVD computes a Tucker decomposition of a tensor via a simple process. For each mode k, it computes the r_k leading left singular values of the matrix unfolding and stores those as factor matrix U_k. Then it computes a ttm of the original tensor and all the factor matrices to yield the core of size r_1 x r_2 x ... x r_d. The core and factor matrices are used to form the ttensor. The values of r_k that lead to a good approximation can be computed automatically to yield a specified error tolerance; this is recommended and the default in our code. The ST-HOSVD is an improvement on the HOSVD that does a TTM in each mode before moving on to the next mode. This has the advantage of shrinking the tensor at each step and reducing subsequent computations. ST-HOSVD is the default in the hosvd code.

Simple example of usage

% Create random 50 x 40 x 30 tensor with 5 x 4 x 3 core
info = create_problem('Type','Tucker','Num_Factors',[5 4 3],'Size',[50 40 30],'Noise',0.01);
X = info.Data;

% Compute HOSVD with desired relative error = 0.1
T = hosvd(X,0.1);

% Check size of core
coresize = size(T.core)

% Check relative error
relerr = norm(X-full(T))/norm(X)
Computing HOSVD...
Size of core: 5 x 4 x 3
||X-T||/||X|| = 0.00995932 <=0.100000 (tol)


coresize =

     5     4     3


relerr =

    0.0100

Generate a core with different accuracies for different sizes

We will create a core tensor that has is nearly block diagonal. The blocks are expontentially decreasing in norm, with the idea that we can pick off one block at a time as we increate the prescribed accuracy of the HOSVD. To do this, we use tenrandblk.

% Block sizes (need not be cubic). Number of rows is the number
% of levels and number of columns is the order of the tensor.
bsz = [3 2 1; 2 2 2; 2 3 4];

% Squared norm of each block. Must be length L and sum to <= 1
bsn = [.9 .09 .009]';

% Create core tensor with given block structure and norm 1
G = tenrandblk(bsz,bsn,true);
Created block of size 3 x 2 x 1 with norm (0.948683)^2=0.900000
Created block of size 2 x 2 x 2 with norm (0.300000)^2=0.090000
Created block of size 2 x 3 x 4 with norm (0.094868)^2=0.009000
Created tensor of size 7 x 7 x 7 with off-block-diaognal norm (0.031623)^2=0.001000
fprintf('Size of G: %s\n', tt_size2str(size(G)));
Size of G: 7 x 7 x 7

Generate data tensor with core described above

We take the core G and embed into into a larger tensor X by using orthogonal transformations. The true rank of this tensor is equal to the size of G.

% Size of X
xsz = [20 20 20];

% Create orthogonal matrices
U = cell(3,1);
for k = 1:3
    V = matrandorth(xsz(k));
    U{k} = V(:,1:size(G,k));
end

% Create X
X = full(ttensor(G,U));

% The norm should be unchanged
fprintf('||X||=%f\n',norm(X));
||X||=1.000000

Compute (full) HOSVD

We compute the ST-HOSVD using the hosvd method. We specify the tolerance to close to machine precision. Ideally, it finds a core that is the same size as G.

fprintf('ST-HOSVD...\n');
T = hosvd(X,2*sqrt(eps));
ST-HOSVD...
Computing HOSVD...
Size of core: 7 x 7 x 7
||X-T||/||X|| = 4.50378e-15 <=0.000000 (tol)

Compute low-rank HOSVD approximation

The norm squared of the first two blocks of G is 0.99, so specifying an error of 1e-2 should yield a core of size 4 x 4 x 3. However, the conservative nature of the algorithm means that it may pick something larger. We can compensate by specifying a larger tolerance.

% Using 1e-2 exactly is potentially too conservative...
fprintf('Result with tol = sqrt(1e-2):\n');
T = hosvd(X, sqrt(1e-2),'verbosity',1);

% But a small multiple (i.e., |ndims(X)|) usually works...
fprintf('Result with tol = sqrt(3e-2):\n');
T = hosvd(X, sqrt(3e-2),'verbosity',1);
Result with tol = sqrt(1e-2):
Computing HOSVD...
Size of core: 6 x 6 x 5
||X-T||/||X|| = 0.0577333 <=0.100000 (tol)

Result with tol = sqrt(3e-2):
Computing HOSVD...
Size of core: 4 x 4 x 3
||X-T||/||X|| = 0.0988936 <=0.173205 (tol)

Similarly, lhe norm squared of the first block of G is 0.9, so specifying an error of 1e-1 should result in a core of size 3 x 2 x 1.

% Using 1e-1 exactly is potentially too conservative...
fprintf('Result with tol = sqrt(1e-1):\n');
T = hosvd(X, sqrt(1e-1),'verbosity',1);

% But a small multiple (i.e., |ndims(X)|) usually works...
fprintf('Result with tol = sqrt(3e-1):\n');
T = hosvd(X, sqrt(3e-1),'verbosity',1);
Result with tol = sqrt(1e-1):
Computing HOSVD...
Size of core: 4 x 4 x 3
||X-T||/||X|| = 0.0988936 <=0.316228 (tol)

Result with tol = sqrt(3e-1):
Computing HOSVD...
Size of core: 2 x 2 x 1
||X-T||/||X|| = 0.316115 <=0.547723 (tol)

Verbosity - Getting more or less information.

Setting the verbosity to zero suppresses all output. Cranking up the verbosity gives some insight into the decision-making process...

% Example 1
T = hosvd(X, sqrt(3e-1),'verbosity',10);
Computing HOSVD...
||X||^2 = 1
tol = 0.547723
eigenvalue sum threshold = tol^2 ||X||^2 / d = 0.1
Reverse cummulative sum of evals of Gram matrix:
1: 1.0000 
2: 0.3963 <-- Cutoff
3: 0.0996 
4: 0.0532 
5: 0.0093 
6: 0.0039 
7: 0.0001 
8: 0.0000 
9: -0.0000 
10: -0.0000 
11: -0.0000 
12: -0.0000 
13: -0.0000 
14: -0.0000 
15: -0.0000 
16: -0.0000 
17: -0.0000 
18: -0.0000 
19: -0.0000 
20: -0.0000 
Reverse cummulative sum of evals of Gram matrix:
1: 0.9004 
2: 0.2968 <-- Cutoff
3: 0.0003 
4: 0.0002 
5: 0.0001 
6: 0.0000 
7: 0.0000 
8: 0.0000 
9: -0.0000 
10: -0.0000 
11: -0.0000 
12: -0.0000 
13: -0.0000 
14: -0.0000 
15: -0.0000 
16: -0.0000 
17: -0.0000 
18: -0.0000 
19: -0.0000 
20: -0.0000 
Reverse cummulative sum of evals of Gram matrix:
1: 0.9001 <-- Cutoff
2: 0.0001 
3: 0.0000 
4: 0.0000 
5: 0.0000 
6: -0.0000 
7: -0.0000 
8: -0.0000 
9: -0.0000 
10: -0.0000 
11: -0.0000 
12: -0.0000 
13: -0.0000 
14: -0.0000 
15: -0.0000 
16: -0.0000 
17: -0.0000 
18: -0.0000 
19: -0.0000 
20: -0.0000 
Size of core: 2 x 2 x 1
||X-T||/||X|| = 0.316115 <=0.547723 (tol)

Example 2

T = hosvd(X, sqrt(3*eps),'verbosity',10);
Computing HOSVD...
||X||^2 = 1
tol = 2.58096e-08
eigenvalue sum threshold = tol^2 ||X||^2 / d = 2.22045e-16
Reverse cummulative sum of evals of Gram matrix:
1: 1.0000 
2: 0.3963 
3: 0.0996 
4: 0.0532 
5: 0.0093 
6: 0.0039 
7: 0.0001 <-- Cutoff
8: 0.0000 
9: -0.0000 
10: -0.0000 
11: -0.0000 
12: -0.0000 
13: -0.0000 
14: -0.0000 
15: -0.0000 
16: -0.0000 
17: -0.0000 
18: -0.0000 
19: -0.0000 
20: -0.0000 
Reverse cummulative sum of evals of Gram matrix:
1: 1.0000 
2: 0.3963 
3: 0.0997 
4: 0.0531 
5: 0.0094 
6: 0.0042 
7: 0.0015 <-- Cutoff
8: -0.0000 
9: -0.0000 
10: -0.0000 
11: -0.0000 
12: -0.0000 
13: -0.0000 
14: -0.0000 
15: -0.0000 
16: -0.0000 
17: -0.0000 
18: -0.0000 
19: -0.0000 
20: -0.0000 
Reverse cummulative sum of evals of Gram matrix:
1: 1.0000 
2: 0.0998 
3: 0.0533 
4: 0.0095 
5: 0.0056 
6: 0.0032 
7: 0.0016 <-- Cutoff
8: -0.0000 
9: -0.0000 
10: -0.0000 
11: -0.0000 
12: -0.0000 
13: -0.0000 
14: -0.0000 
15: -0.0000 
16: -0.0000 
17: -0.0000 
18: -0.0000 
19: -0.0000 
20: -0.0000 
Size of core: 7 x 7 x 7
||X-T||/||X|| = 4.50378e-15 <=0.000000 (tol)

Specify the ranks

If you know the rank you want, you can specify it. But there's no guarantee that it will satisfy the specified tolerance. In such cases, the method will throw a warning.

% Rank is okay
T = hosvd(X,sqrt(3e-1),'ranks',bsz(1,:));

% Rank is too small for the specified error
T = hosvd(X,sqrt(3e-1),'ranks',[1 1 1]);

% But you can set the error to the tensor norm to make the warning go away
T = hosvd(X,norm(X),'ranks',[1 1 1]);
Computing HOSVD...
Size of core: 3 x 2 x 1
||X-T||/||X|| = 0.316113 <=0.547723 (tol)

Computing HOSVD...
Size of core: 1 x 1 x 1
Tolerance not satisfied!! ||X-T||/||X|| = 0.629625 >=0.547723 (tol)
Warning: Specified tolerance was not achieved 

Computing HOSVD...
Size of core: 1 x 1 x 1
||X-T||/||X|| = 0.629625 <=1.000000 (tol)

Specify the mode order

It's also possible to specify the order of the modes. The default is 1:ndims(X).

T = hosvd(X,sqrt(3e-1),'dimorder',ndims(X):-1:1);
Computing HOSVD...
Size of core: 2 x 2 x 1
||X-T||/||X|| = 0.316106 <=0.547723 (tol)

Generate bigger data tensor with core described above

Uses the same procedure as before, but now the size is bigger.

% Size of Y
ysz = [100 100 100];

% Create orthogonal matrices
U = cell(3,1);
for k = 1:3
    V = matrandorth(ysz(k));
    U{k} = V(:,1:size(G,k));
end

% Create Y
Y = full(ttensor(G,U));

ST-HOSVD compared to HOSVD

The answers are essentially the same for the sequentially-truncated HOSVD and the HOSVD...

fprintf('ST-HOSVD...\n');
T = hosvd(Y,2*sqrt(eps));
fprintf('HOSVD...\n');
T = hosvd(Y,2*sqrt(eps),'sequential',false);
ST-HOSVD...
Computing HOSVD...
Size of core: 7 x 7 x 7
||X-T||/||X|| = 2.78285e-15 <=0.000000 (tol)

HOSVD...
Computing HOSVD...
Size of core: 7 x 7 x 7
||X-T||/||X|| = 2.27855e-15 <=0.000000 (tol)

But ST-HOSVD may be slightly faster than HOSVD for larger tensors.

fprintf('Time for 10 runs of ST-HOSVD:\n');
tic, for i =1:10, T = hosvd(Y,2*sqrt(eps),'verbosity',0); end; toc

fprintf('Time for 10 runs of HOSVD:\n');
tic, for i =1:10, T = hosvd(Y,2*sqrt(eps),'verbosity',0,'sequential',false); end; toc
Time for 10 runs of ST-HOSVD:
Elapsed time is 0.194390 seconds.
Time for 10 runs of HOSVD:
Elapsed time is 0.405164 seconds.