Computing Reviews

Bayesian multi-tensor factorization
Khan S., Leppäaho E., Kaski S. Machine Learning 105(2): 233-253, 2016. Type: Article
Date Reviewed: 02/23/17

Data mining increasingly faces the problem of extracting new knowledge from experimental data collected from complex phenomena. To extract hidden information, such datasets can be decomposed into the components that underlie them. Because data are usually stored as multidimensional arrays, or tensors, algebraic methods such as low-rank matrix and tensor factorization can be applied. For mixed and partially linked datasets, as in the case of microarray experiments, joint factorization of multiple matrices is needed. Methods for coupled multimatrix tensor factorization (MTF) are under development.

This paper proposes a new MTF method that can handle multiple paired tensors and matrices sharing a common set of samples. The novelty lies in integrating a probabilistic Bayesian formulation; a further new result is a relaxed factorization that allows the tensor to be decomposed in a flexible way. The authors show how the various parameters can be tuned, manually or automatically, to specific tasks.
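To make the idea of paired data sharing a set of samples concrete, here is a minimal NumPy sketch. It is not the authors' Bayesian method: it only builds a synthetic matrix and a synthetic three-way tensor from the same sample factors and recovers the shared sample subspace with a plain singular value decomposition. All function names, sizes, and the rank are assumptions chosen for illustration.

```python
import numpy as np


def make_coupled_data(n_samples=30, d_matrix=8, d2=5, d3=4, rank=3, seed=0):
    """Generate a matrix X and a tensor Y that share the sample factors A.

    Hypothetical helper: sizes and rank are illustrative, not from the paper.
    """
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n_samples, rank))  # shared sample factors
    B = rng.standard_normal((d_matrix, rank))   # matrix-specific loadings
    C = rng.standard_normal((d2, rank))         # tensor loadings, mode 2
    D = rng.standard_normal((d3, rank))         # tensor loadings, mode 3
    X = A @ B.T                                 # n_samples x d_matrix
    # CP-style tensor: Y[n, i, j] = sum_k A[n, k] * C[i, k] * D[j, k]
    Y = np.einsum("nk,ik,jk->nij", A, C, D)
    return A, X, Y


def shared_subspace(X, Y, rank):
    """Estimate the shared sample subspace from the stacked data views."""
    Y1 = Y.reshape(Y.shape[0], -1)   # unfold the tensor along the sample mode
    stacked = np.hstack([X, Y1])     # both views share the same rows (samples)
    U, s, _ = np.linalg.svd(stacked, full_matrices=False)
    return U[:, :rank], s


A, X, Y = make_coupled_data()
U, s = shared_subspace(X, Y, rank=3)

# In this noise-free setting the stacked data has numerical rank 3, and the
# true sample factors A lie in the span of the recovered basis U.
residual = np.linalg.norm(A - U @ (U.T @ A))
```

The sketch only recovers a subspace, not the individual components, and it ignores noise, missing values, and components private to one view. Those are the kinds of issues a probabilistic formulation such as the one reviewed here is designed to address.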

The final part of the paper presents experiments on synthetic and real data, including neuroimaging and toxicogenomics, and discusses the results in quantitative and qualitative terms. The authors claim, with support from those results, that the method is effective in finding shared patterns in data, thus creating new hypotheses to explore. If supported by more evidence, this would perhaps be the best indication of the value of the new algorithm.

This paper continues the trend of combining machine learning with algebraic and statistical tools, where mathematical formulations are an alternative to other knowledge-intensive methods. It is aimed at data mining and machine learning researchers; although the applications are important for biomedical researchers, the paper is not written in their language.

In conclusion, Bayesian multitensor factorization may be a very good tool for real data analysis; however, its implementation does not seem straightforward, and no runtime performance data are reported.

Reviewer: G. Gini
Review #: CR145079 (1705-0287)

Reproduction in whole or in part without permission is prohibited.   Copyright 2024 ComputingReviews.com™