Unsupervised learning

A branch of machine learning that tries to make sense of data that has not been labeled, classified, or categorized by extracting features and patterns on its own.

Unsupervised learning is a branch of machine learning that takes unlabeled data, meaning data that has not been previously classified or categorized, and tries to extract features and patterns from it on its own. Where supervised learning is analogous to taking a multiple-choice test with a predetermined answer key, unsupervised learning is analogous to taking an open-ended test whose questions have no answer key or objective means of determining a grade.

The general goal of unsupervised learning is to gain insight into a given data set by modeling the underlying structure or distribution of the data. Unsupervised learning algorithms aren't searching for concrete correct answers or specific outputs. Rather, they are handed a data set without explicit instructions on what to do with it, and they are left alone to find interesting structure in the data.
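As a minimal sketch of what "modeling the underlying distribution" can mean, the example below fits a single Gaussian to a handful of unlabeled numbers and then asks how plausible new values are under that fit. The function names (fit_gaussian, gaussian_pdf) and the readings are hypothetical, chosen only for illustration, and a single Gaussian is assumed here purely for simplicity; real data often needs a richer model.

```python
import math
import statistics


def fit_gaussian(samples):
    """Estimate the mean and standard deviation of unlabeled 1-D samples."""
    mu = statistics.fmean(samples)
    sigma = statistics.pstdev(samples, mu)  # population standard deviation
    return mu, sigma


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


# Hypothetical unlabeled measurements: no target values or categories attached.
readings = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0]
mu, sigma = fit_gaussian(readings)

# A value near the bulk of the data gets a much higher density than a distant one.
typical = gaussian_pdf(5.0, mu, sigma)
unusual = gaussian_pdf(9.0, mu, sigma)
print(typical > unusual)
```

Nothing in the input says which readings are "right"; the model only summarizes where the data tends to lie.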

Types of Unsupervised Learning

The different unsupervised learning models that exist can be categorized based on the ways in which they organize data.

  • Clustering - Identifying and grouping similar data points together. Variations include k-means, k-means++, hierarchical clustering, density clustering, spectral clustering, and more.
  • Data compression / dimensionality reduction - Identifying and removing redundant data from a data set so that most of the important information can be represented with only a fraction of the actual content, saving on computing power and storage costs. These methods include nonlinear dimensionality reduction (NDR), non-negative matrix factorization (NMF), singular value decomposition (SVD), as well as principal component analysis (PCA) and variations of PCA such as kernel PCA and sparse PCA.
  • Anomaly detection - Identifying unusual patterns that do not conform to expected behavior. Several types of anomaly detection can be used for different purposes, including clustering-based methods such as k-means, support vector machine-based methods, and density-based methods such as k-NN or local outlier factor (LOF).
  • Association - Discovering interesting relationships between variables in large data sets. Well-known association algorithms include Apriori and Eclat.
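To make the clustering category above concrete, here is a minimal sketch of plain k-means (Lloyd's algorithm) using only the Python standard library. The helper names (squared_distance, mean_point, k_means) and the toy data are hypothetical, and the sketch assumes random initial centroids rather than the k-means++ seeding mentioned in the list; a production library would handle initialization and edge cases more carefully.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iters=100, seed=0):
    """Plain k-means (Lloyd's algorithm) with random initial centroids.

    Returns (centroids, labels), where labels[i] is the index of the
    centroid assigned to points[i]. No class labels are used anywhere.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen data points
    labels = None
    for _ in range(max_iters):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we treat this as converged
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = mean_point(members)
    return centroids, labels


# Hypothetical toy data: two well-separated groups in 2-D, with no labels given.
data = [
    (1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (1.1, 1.3),
    (8.0, 8.2), (8.3, 7.9), (7.8, 8.1), (8.1, 8.4),
]
centroids, labels = k_means(data, k=2)
print(labels)
```

The cluster indices themselves are arbitrary; what matters is that the first four points share one index and the last four share the other. The same distances could also serve as a rough clustering-based anomaly score, since a point far from every centroid does not fit any discovered group.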

Further reading

  • Clustering Based Unsupervised Learning - Syed Sadat Nazrul (Web)
  • Machine Learning for Humans, Part 3: Unsupervised Learning - Vishal Maini (Web)
  • NVIDIA Blog: Supervised Vs. Unsupervised Learning - Isha Salian (Web)
