Word2vec is a model for learning vector representations of words, called word embeddings. It transforms words into numerical form that can then be used in natural language processing and machine learning applications.
Word2vec computes distributed vector representations of words. Distributed representations make generalization to novel patterns easier and model estimation more robust. They are used in natural language processing applications such as named entity recognition, disambiguation, parsing, tagging, and machine translation.
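As a concrete illustration of words becoming vectors, the sketch below trains a small model with the gensim library (assuming gensim 4.x, where the parameter is named vector_size; the toy corpus and parameter values are illustrative only):

```python
from gensim.models import Word2Vec

# Toy corpus: a list of tokenized sentences. Real embeddings need far more text.
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
    ["cats", "and", "dogs", "are", "common", "pets"],
]

# sg=0 trains the CBOW model; sg=1 would train skip-gram instead.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=0, epochs=100)

vector = model.wv["cat"]   # the learned 50-dimensional vector for "cat"
print(vector.shape)        # -> (50,)

# Nearest neighbors by cosine similarity in the embedding space.
print(model.wv.most_similar("cat", topn=3))
```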
Word2vec is a computationally efficient predictive model for learning word embeddings from raw text. It comes in two unsupervised variants: the Continuous Bag-of-Words model (CBOW) and the Skip-Gram model. Algorithmically, the two models are similar, except that CBOW predicts a target word from its surrounding context words, while skip-gram does the inverse and predicts context words from the target word.
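The difference between the two objectives is easiest to see in the (input, output) training pairs they generate. The following pure-Python sketch (the function name and window size are illustrative choices, not part of any reference implementation) builds both kinds of pairs from one tokenized sentence:

```python
def training_pairs(sentence, window=2):
    """Generate (context, target) pairs for CBOW and (target, context_word)
    pairs for skip-gram from one tokenized sentence."""
    cbow, skipgram = [], []
    for i, target in enumerate(sentence):
        # Context words within `window` positions of the target, excluding it.
        context = [sentence[j]
                   for j in range(max(0, i - window),
                                  min(len(sentence), i + window + 1))
                   if j != i]
        cbow.append((context, target))                  # CBOW: context -> target
        skipgram.extend((target, c) for c in context)   # skip-gram: target -> context word
    return cbow, skipgram

cbow, skipgram = training_pairs(["the", "quick", "brown", "fox"], window=1)
print(cbow[1])       # -> (['the', 'brown'], 'quick')
print(skipgram[:2])  # -> [('the', 'quick'), ('quick', 'the')]
```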
Further Resources
Caviar’s Word2Vec Tagging For Menu Item Recommendations
Christopher Skeels and Yash Patel
Blog post
Corpus specificity in LSA and Word2vec: the role of out-of-domain documents
Edgar Altszyler, Mariano Sigman, Diego Fernandez Slezak
Academic paper
Determining the Characteristic Vocabulary for a Specialized Dictionary using Word2vec and a Directed Crawler
Gregory Grefenstette, Lawrence Muchemi
Academic paper
Distributed Representations of Words and Phrases and their Compositionality
Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, Jeffrey Dean
Academic paper
Efficient Estimation of Word Representations in Vector Space
Tomas Mikolov, Kai Chen, Greg Corrado, Jeffrey Dean
Academic paper