An automated approach for determining the number of components in non-negative matrix factorization with application to mutational signature learning

Gilad, Gal and Sason, Itay and Sharan, Roded (2021) An automated approach for determining the number of components in non-negative matrix factorization with application to mutational signature learning. Machine Learning: Science and Technology, 2 (1). 015013. ISSN 2632-2153

Full text: Gilad_2021_Mach._Learn.__Sci._Technol._2_015013.pdf (Published Version, 708kB)

Abstract

Non-negative matrix factorization (NMF) is a popular method for finding a low-rank approximation of a matrix, thereby revealing the latent components behind it. In genomics, NMF is widely used to interpret mutation data and derive the underlying mutational processes and their activities. A key challenge in the use of NMF is determining the number of components, or rank, of the factorization. Here we propose a novel method, CV2K, that chooses this number automatically from the data, based on a detailed cross-validation procedure combined with a parsimony consideration. We apply our method to mutational signature analysis and demonstrate its utility on both simulated and real data sets. In comparison to previous approaches, some of which involve human assessment, CV2K leads to improved predictions across a wide range of data sets.
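
The general idea of choosing an NMF rank by cross-validation plus parsimony can be illustrated with a minimal sketch. This is not the CV2K implementation: the abstract does not specify the procedure, so every detail below is an assumption made for illustration. Those assumptions include hiding random matrix entries, fitting with masked multiplicative updates under a squared-error loss, and picking the smallest rank whose mean held-out error lies within one standard error of the best. The function names masked_nmf, cv_errors, and choose_rank are hypothetical.

```python
import numpy as np


def masked_nmf(V, mask, k, n_iter=200, seed=0, eps=1e-9):
    """Fit V ~ W @ H using only entries where mask == 1.

    Uses weighted multiplicative updates for the squared-error loss
    (an illustrative choice, not necessarily the paper's solver).
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    MV = mask * V
    for _ in range(n_iter):
        H *= (W.T @ MV) / (W.T @ (mask * (W @ H)) + eps)
        W *= (MV @ H.T) / ((mask * (W @ H)) @ H.T + eps)
    return W, H


def cv_errors(V, ranks, n_folds=5, holdout=0.1, seed=0):
    """Return an array of shape (len(ranks), n_folds) with held-out MSE."""
    rng = np.random.default_rng(seed)
    errors = np.zeros((len(ranks), n_folds))
    for f in range(n_folds):
        # Assumption: hide a random fraction of entries in each fold.
        mask = (rng.random(V.shape) >= holdout).astype(float)
        held = 1.0 - mask
        for i, k in enumerate(ranks):
            W, H = masked_nmf(V, mask, k, seed=seed + f)
            resid = held * (V - W @ H)
            errors[i, f] = np.sum(resid ** 2) / max(held.sum(), 1.0)
    return errors


def choose_rank(errors, ranks):
    """Parsimony rule (assumed): smallest rank within one standard
    error of the lowest mean held-out error. `ranks` must be ascending."""
    mean = errors.mean(axis=1)
    sem = errors.std(axis=1, ddof=1) / np.sqrt(errors.shape[1])
    best = int(np.argmin(mean))
    threshold = mean[best] + sem[best]
    for i, k in enumerate(ranks):
        if mean[i] <= threshold:
            return k
    return ranks[best]
```

For a non-negative count matrix V (for example, mutation categories by samples), calling choose_rank(cv_errors(V, ranks), ranks) with ranks = [1, 2, 3, 4, 5] returns one candidate rank. The one-standard-error rule is a common heuristic for preferring simpler models and stands in here for the parsimony consideration mentioned in the abstract.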

Item Type: Article
Subjects: Souths Book > Multidisciplinary
Depositing User: Unnamed user with email support@southsbook.com
Date Deposited: 03 Jul 2023 04:58
Last Modified: 14 Sep 2024 04:44
URI: http://research.europeanlibrarypress.com/id/eprint/1329
