Factor analysis

Other attributes

Wikidata ID

Factor analysis is an exploratory/descriptive method for modeling observed variables and their variance/covariance structures in terms of a smaller number of underlying, unobservable (i.e. latent) factors.

Factors are commonly viewed as broad concepts/ideas that can describe an observed phenomenon. As a result, use of factor analysis can be controversial because the method is somewhat subjective and open to flexible interpretations.

For example, a strong desire to obtain social status may explain why many people exercise, but how much this factor influences motivation and discipline to exercise depends on the individual and can't be accurately measured quantitatively. There are also other factors that might influence how much somebody exercises, as well as other observable ways that people might try to obtain social status other than exercising.

Data variables for the example above could be:

How much money do you spend on gym memberships and fitness equipment?
How much time do you spend exercising each day?
Do you prefer to exercise alone or with another person / group?
Etc.

The answers to these questions are specific and quantifiable, whereas the underlying factors that might influence them are not. In factor analysis, observed variables are modeled as linear functions of the "factors".

Factor Analysis in Machine Learning

Factor analysis is a technique sometimes used in machine learning to gain insights about the connections and interdependencies between observed variables in a data set. A common application is the case in which data sets have a large amount of observed variables and it is thought that they can be condensed to reflect a smaller number of underlying variables.