Data labeling is the process of identifying raw data and adding informative labels to provide context so that a machine learning model can learn from it.
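As a rough illustration of that definition, the sketch below pairs a few raw text items with labels to form the kind of labeled examples a supervised model could train on. The review texts, the `positive`/`negative` label set, and the names `LabeledExample` and `attach_labels` are all hypothetical and chosen only for this example; real labeling projects typically involve annotation tools, guidelines, and quality checks that are not shown here.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledExample:
    """One raw data item together with the label that gives it context."""
    text: str   # the raw, unlabeled data
    label: str  # the informative label added by an annotator


# Hypothetical raw data: short product reviews with no labels yet.
raw_reviews = [
    "The battery lasts all day.",
    "Screen cracked within a week.",
    "Fast shipping and works as described.",
]

# Hypothetical annotator decisions, one per review, in the same order.
annotations = ["positive", "negative", "positive"]


def attach_labels(items, labels):
    """Pair each raw item with its label, refusing mismatched lengths."""
    if len(items) != len(labels):
        raise ValueError("every raw item needs exactly one label")
    return [LabeledExample(text, label) for text, label in zip(items, labels)]


labeled = attach_labels(raw_reviews, annotations)

# Count how many examples carry each label (useful for spotting class imbalance).
label_counts = Counter(example.label for example in labeled)
```

The resulting `labeled` list is the form most supervised learning code expects: each input is stored alongside the target the model should learn to predict.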