# Weighted optimization for CP tensor decomposition with incomplete data

We explain how to use the CP Weighted Optimization (CP-WOPT) method implemented in `cp_wopt`. The method is described in the following article:

- E. Acar, D. M. Dunlavy, T. G. Kolda and M. Mørup, **Scalable Tensor Factorizations for Incomplete Data**, *Chemometrics and Intelligent Laboratory Systems*, 106(1):41-56, March 2011 (doi:10.1016/j.chemolab.2010.08.004)

## Contents

- Third-party optimization software
- Important Information
- Create an example problem with missing data.
- Create initial guess using 'nvecs'
- Call the `cp_wopt` method
- Check the output
- Evaluate the output
- Create a SPARSE example problem with missing data.
- Create initial guess using 'nvecs'
- Call the `cp_wopt` method
- Check the output
- Evaluate the output

## Third-party optimization software

The `cp_wopt` method uses third-party optimization software to perform the optimization. You can use either L-BFGS-B or the Poblano Toolbox.

The remainder of these instructions assumes L-BFGS-B is being used. See here for instructions on using `cp_wopt` with Poblano.

## Important Information

It is critical to zero out the values in the missing entries of the data tensor. This can be done by calling `cp_wopt(X.*P,P,...)`. Failing to do so is a frequent source of errors when using this method.
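If you prefer to do the zeroing yourself in preprocessing, a minimal sketch (assuming `X`, `P`, `R`, and `M_init` are defined as in the examples below) looks like this; the `'skip_zeroing'` option then tells `cp_wopt` not to repeat the work:

```
% Zero out the missing entries once, up front: P is 1 for observed
% entries and 0 for missing ones, so X.*P keeps only observed values.
X = X .* P;
[M,~,output] = cp_wopt(X, P, R, 'init', M_init, 'skip_zeroing', true);
```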

## Create an example problem with missing data.

Here we have 25% missing data and 10% noise.

```
R = 2;
info = create_problem('Size', [15 10 5], 'Num_Factors', R, ...
    'M', 0.25, 'Noise', 0.10);
X = info.Data;
P = info.Pattern;
M_true = info.Soln;
```

## Create initial guess using 'nvecs'

```
M_init = create_guess('Data', X, 'Num_Factors', R, ...
    'Factor_Generator', 'nvecs');
```

## Call the `cp_wopt` method

Here is an example call to the `cp_wopt` method. By default, the method periodically prints the least squares fit function value (being minimized) and the infinity norm of the gradient.

```
[M,~,output] = cp_wopt(X, P, R, 'init', M_init);
```

```
Running CP-WOPT...
Time for zeroing out masked entries of data tensor is 4.39e-04 seconds.
(If zeroing is done in preprocessing, set 'skip_zeroing' to true.)
Iter 10, f(x) = 1.299287e+01, ||grad||_infty = 5.84e+00
Iter 20, f(x) = 9.896947e-01, ||grad||_infty = 4.82e-02
Iter 30, f(x) = 9.893514e-01, ||grad||_infty = 9.85e-05
Iter 32, f(x) = 9.893514e-01, ||grad||_infty = 6.39e-05
```
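The reported `f(x)` is the weighted least-squares objective, which counts errors only on observed entries. As a rough sketch using standard Tensor Toolbox operations (assuming `M` is the returned ktensor and `X` has already been zeroed on the missing entries):

```
% Weighted least-squares objective: only entries with P==1 contribute.
% full(M) expands the ktensor M to a dense tensor for comparison.
f = 0.5 * norm(P .* (X - full(M)))^2;
```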

## Check the output

It's important to check the output of the optimization method. In particular, it's worthwhile to check the exit message for any problems. The message `CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH` means that it has converged because the function value stopped improving.

```
exitmsg = output.ExitMsg
```

```
exitmsg = 'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH.'
```

## Evaluate the output

We can "score" the similarity between the model computed by CP-WOPT and the true solution. The `score` function on ktensors gives a score in [0,1], with 1 indicating a perfect match. Because we have noise, we do not expect the fit to be perfect. See `doc score` for more details.

```
scr = score(M,M_true)
```

```
scr = 0.9991
```

## Create a SPARSE example problem with missing data.

Here we have 95% missing data and 10% noise.

```
R = 2;
info = create_problem('Size', [150 100 50], 'Num_Factors', R, ...
    'M', 0.95, 'Sparse_M', true, 'Noise', 0.10);
X = info.Data;
P = info.Pattern;
M_true = info.Soln;
```

## Create initial guess using 'nvecs'

```
M_init = create_guess('Data', X, 'Num_Factors', R, ...
    'Factor_Generator', 'nvecs');
```

## Call the `cp_wopt` method

```
[M,~,output] = cp_wopt(X, P, R, 'init', M_init);
```

```
Running CP-WOPT...
Time for zeroing out masked entries of data tensor is 3.66e-02 seconds.
(If zeroing is done in preprocessing, set 'skip_zeroing' to true.)
Iter 10, f(x) = 1.895160e+02, ||grad||_infty = 2.96e+01
Iter 20, f(x) = 1.711120e+02, ||grad||_infty = 8.23e-03
Iter 21, f(x) = 1.711120e+02, ||grad||_infty = 1.18e-03
```

## Check the output

```
exitmsg = output.ExitMsg
```

```
exitmsg = 'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH.'
```

## Evaluate the output

```
scr = score(M,M_true)
```

```
scr = 0.9995
```