Non-negative matrix factorization (NMF)

Non-negative matrix factorization (NMF or NNMF) is a matrix factorization method in which all values in the matrices are constrained to be non-negative, which makes the resulting factors easier to inspect.

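Matrix factorization is also sometimes referred to as matrix decomposition. It is the process of taking one matrix (V) and finding two smaller matrices (W and H) whose product approximates the original matrix. If V has m rows and n columns, then W is m × k and H is k × n, where the number of latent features k is typically chosen to be much smaller than m and n.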

When NMF is used in machine learning, W is the weights matrix and H is the features matrix. This means that W has a column for each feature and a row for each row in the original matrix, while H has a row for each feature and a column for each column in the original matrix. The product of W and H then gives, for each element of the original matrix, the sum over the latent features of the corresponding weight in W multiplied by the corresponding feature value in H.
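The shapes described above can be checked with a short sketch. It assumes NumPy and uses random non-negative factors rather than a fitted model; the sizes m, n, and k and the chosen element (i, j) are arbitrary, hypothetical values picked only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 4 rows, 5 columns, 2 latent features.
m, n, k = 4, 5, 2

W = rng.random((m, k))   # weights: one row per row of V, one column per feature
H = rng.random((k, n))   # features: one row per feature, one column per column of V

V_approx = W @ H         # matrix product, shape (m, n)

# A single element is the sum over features of weight times feature value.
i, j = 1, 3
element = sum(W[i, f] * H[f, j] for f in range(k))

print(V_approx.shape)
print(np.isclose(element, V_approx[i, j]))
```

Because W and H contain only non-negative values, every entry of their product is non-negative as well.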

The utility of NMF in machine learning is that it can be used to predict values for entries that were zero in the original matrix, where a zero usually stands for a missing or unobserved value. NMF is related to both supervised and unsupervised learning methodologies, particularly the latter.

One common technique for performing matrix factorization is gradient descent, which has many variants, such as stochastic gradient descent (SGD), the momentum method, Adagrad, Adadelta, RMSprop, and Adam.
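As a rough illustration of gradient descent applied to this problem, the sketch below fits W and H to the non-zero entries of a small matrix and then reads predictions for the zero entries from the product. It is a minimal sketch under stated assumptions, not a reference implementation: it uses plain (full-batch) gradient descent on squared error, keeps the factors non-negative by clipping negative values to zero after each step (projected gradient descent), and treats zeros as missing. The function name `factorize`, the example matrix, the learning rate, and the step count are all hypothetical choices.

```python
import numpy as np

def factorize(V, mask, k=2, lr=0.01, steps=5000, seed=0):
    """Projected gradient descent on squared error over observed entries."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, k))
    H = rng.random((k, n))
    for _ in range(steps):
        E = mask * (V - W @ H)           # residual on observed entries only
        W_new = W + lr * (E @ H.T)       # step against the gradient w.r.t. W
        H_new = H + lr * (W.T @ E)       # step against the gradient w.r.t. H
        W = np.maximum(W_new, 0.0)       # project back onto non-negative values
        H = np.maximum(H_new, 0.0)
    return W, H

def observed_rmse(V, mask, W, H):
    """Root-mean-square error over the observed entries."""
    return np.sqrt((mask * (V - W @ H) ** 2).sum() / mask.sum())

# Hypothetical example matrix; zeros are treated as missing values.
V = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 0, 4],
              [0, 1, 5, 4]], dtype=float)
mask = (V > 0).astype(float)

W, H = factorize(V, mask)
V_hat = W @ H                            # includes predictions for the zero entries

print(np.round(V_hat, 2))
print(observed_rmse(V, mask, W, H))
```

The entries of `V_hat` at positions where `V` is zero are the model's predictions. Swapping the full-batch step for SGD, momentum, or Adam changes how the step is computed but not the overall structure of the loop.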

Another widely used family of techniques is based on singular value decomposition (SVD) and includes the Funk SVD, SVD++, Iterative SVD, Regularized SVD, and Asymmetric SVD algorithms. Funk SVD was first introduced in a 2006 blog post by Simon Funk titled "Netflix Update: Try This at Home", in which Funk outlined a method that Netflix could use to recommend relevant shows and movies to its users.
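For comparison with NMF, the following sketch computes a rank-k approximation with the classical SVD using NumPy's `np.linalg.svd`. This is the textbook truncated SVD, not Funk SVD or any of the other variants named above, and the example matrix and the choice k = 2 are hypothetical. It also checks whether the resulting factors contain negative entries, which is the main practical difference from NMF.

```python
import numpy as np

# Hypothetical, strictly positive example matrix.
V = np.array([[5.0, 3.0, 1.0],
              [4.0, 2.0, 1.0],
              [1.0, 1.0, 5.0],
              [1.0, 2.0, 4.0]])
k = 2

U, s, Vt = np.linalg.svd(V, full_matrices=False)

W_svd = U[:, :k] * s[:k]   # scale the first k left singular vectors
H_svd = Vt[:k, :]          # first k right singular vectors
V_k = W_svd @ H_svd        # rank-k approximation of V

error = np.linalg.norm(V - V_k)   # Frobenius norm of the residual
has_negative = bool((W_svd < 0).any() or (H_svd < 0).any())

print(error)
print(has_negative)
```

The truncated SVD gives the best rank-k approximation in the Frobenius norm, but its factors mix positive and negative values, so they cannot be read as additive parts in the way NMF factors can.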

NMF Methods for Clustering

Several NMF variants have been shown to perform well as alternatives to conventional clustering algorithms. Clustering, or cluster analysis, is an unsupervised learning problem. Clustering algorithms aim to partition a data set into groups such that the similarity within each group is maximized and the similarity between different groups is minimized.

Researchers have developed a variety of different methods for applying NMF to clustering, including Sparse NMF, Projective NMF, Non-negative Spectral Clustering, and Cluster-NMF.
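As a baseline for the variants listed above, plain NMF can already be read as a clustering: each row of the data is assigned to the latent feature with the largest weight in W. The sketch below illustrates this idea only; it does not implement Sparse NMF, Projective NMF, or the other named methods. It assumes NumPy and fits the factors with Lee and Seung's multiplicative update rules, a standard NMF solver that the text above does not mention. The function name `nmf_multiplicative` and the small block-structured matrix are hypothetical.

```python
import numpy as np

def nmf_multiplicative(V, k, steps=500, seed=0, eps=1e-9):
    """Basic NMF via multiplicative updates for the squared-error objective."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, k)) + eps
    H = rng.random((k, n)) + eps
    for _ in range(steps):
        # Each update multiplies by a non-negative ratio, so the factors
        # stay non-negative; eps guards against division by zero.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical data: the first three rows use columns 0-1,
# the last three rows use columns 2-3.
V = np.array([[5.0, 4.0, 0.0, 0.0],
              [4.0, 5.0, 0.0, 0.0],
              [5.0, 5.0, 0.0, 0.0],
              [0.0, 0.0, 3.0, 4.0],
              [0.0, 0.0, 4.0, 3.0],
              [0.0, 0.0, 4.0, 4.0]])

W, H = nmf_multiplicative(V, k=2)

# Cluster label of each row = index of its largest weight in W.
labels = W.argmax(axis=1)
print(labels)
```

Which cluster receives index 0 and which receives index 1 depends on the random initialization, so only the grouping of the rows is meaningful, not the label values themselves.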