spaCy excels at large-scale information extraction tasks. It's written from the ground up in carefully memory-managed Cython.
spaCy is a way to prepare text for deep learning. It interoperates with TensorFlow, PyTorch, scikit-learn, Gensim and the rest of Python's AI ecosystem. With spaCy, you can construct linguistically sophisticated statistical models for a variety of NLP problems.
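As a rough illustration of preparing text for a downstream model, the sketch below tokenizes a sentence and exports per-token attributes as a NumPy array. It assumes spaCy is installed and uses a blank English pipeline (tokenizer only), so no trained model is required; the sample sentence and the choice of attributes are illustrative, not taken from the text above.

```python
import spacy
from spacy.attrs import LOWER, IS_ALPHA

# Blank English pipeline: rule-based tokenizer only, no trained
# components, so nothing has to be downloaded (assumption for this sketch).
nlp = spacy.blank("en")

doc = nlp("spaCy prepares text for deep learning.")

# Plain token strings, e.g. for inspection or vocabulary building.
tokens = [token.text for token in doc]

# One row per token, one column per requested attribute:
# column 0 is the hash ID of the lowercased form, column 1 is an
# is-alphabetic flag. Arrays like this can be handed to other libraries.
features = doc.to_array([LOWER, IS_ALPHA])

print(tokens)
print(features.shape)
```

The resulting array is an ordinary NumPy array, so it can be converted to a tensor or fed to a scikit-learn estimator with whatever preprocessing that library expects.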
Learn more from small training corpora by initializing your models with knowledge from raw text. The new pretrain command teaches spaCy's CNN model to predict words based on their context, producing context-sensitive word representations. It's still experimental, but users are already reporting good results.
In 2015, independent researchers from Emory University and Yahoo! Labs showed that spaCy offered the fastest syntactic parser in the world and that its accuracy was within 1% of the best available (Choi et al., 2015). spaCy v2.0, released in 2017, is more accurate than any of the systems Choi et al. evaluated.