Sparse principal component analysis (Sparse PCA) is an extension of classic principal component analysis (PCA) that reduces the dimensionality of data while offering better statistical properties and more interpretable components.
One disadvantage of classic PCA is that each principal component is a linear combination of all of the original variables, typically with nonzero weight on every one of them. Sparse PCA extends traditional PCA by finding linear combinations that contain only a few input variables.
For some problems, this means that Sparse PCA produces results similar to those of traditional PCA, but with simpler and more interpretable components.
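One common way to write down the idea of "only a few input variables" is a variance-maximization problem with a limit on the number of nonzero weights. The block below is a sketch of that formulation for the first component only. The symbols are assumptions introduced here for illustration: \Sigma is the sample covariance matrix of the centered data, v is the weight (loading) vector, and k is the maximum number of variables allowed in the component.

```latex
\begin{aligned}
\max_{v \in \mathbb{R}^{p}} \quad & v^{\top} \Sigma \, v \\
\text{subject to} \quad & \lVert v \rVert_{2} = 1, \\
                        & \lVert v \rVert_{0} \le k
\end{aligned}
```

Here \lVert v \rVert_{0} counts the nonzero entries of v. Without the last constraint, the problem reduces to classic PCA, whose solution is the leading eigenvector of \Sigma. Practical methods usually replace the count with a more tractable penalty or relaxation, such as an L1 norm.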
For example, in "A Direct Formulation for Sparse PCA Using Semidefinite Programming" (d'Aspremont et al., 2007), 500 genes were measured across a large number of samples. With traditional PCA, each of the factors obtained uses all 500 genes, making the results difficult to interpret. With Sparse PCA, the factors together involved only 14 genes, and the results were easier to interpret.
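The contrast between dense and sparse components can be illustrated on a small synthetic dataset. The following is a minimal sketch, not a reproduction of the gene-expression study: it assumes scikit-learn is installed and uses its `PCA` and `SparsePCA` estimators, which solve an L1-penalized formulation rather than the semidefinite program of the cited paper. The data, the dimensions, and the penalty `alpha=1.0` are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA, SparsePCA

rng = np.random.default_rng(0)
n_samples, n_features = 200, 30  # illustrative sizes, not from the source

# Synthetic data (an assumption for this sketch): two hidden factors,
# each driving its own block of 5 variables; the other 20 are pure noise.
latent = rng.normal(size=(n_samples, 2))
loadings = np.zeros((2, n_features))
loadings[0, :5] = 1.0
loadings[1, 5:10] = 1.0
X = latent @ loadings + 0.1 * rng.normal(size=(n_samples, n_features))
X -= X.mean(axis=0)  # center the columns

# Classic PCA: every component has a weight on every variable.
pca = PCA(n_components=2).fit(X)

# Sparse PCA: the L1 penalty (alpha) drives many weights to exactly zero.
spca = SparsePCA(n_components=2, alpha=1.0, random_state=0).fit(X)

pca_nonzero = np.count_nonzero(pca.components_, axis=1)
spca_nonzero = np.count_nonzero(spca.components_, axis=1)

print("nonzero weights per PCA component:       ", pca_nonzero)
print("nonzero weights per Sparse PCA component:", spca_nonzero)
```

Larger values of `alpha` give sparser components at the cost of explaining less variance, so in practice the penalty is tuned to the problem at hand.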