Spark NLP becomes the world’s most widely used NLP library in the enterprise within 18 months

O’Reilly’s “AI Adoption in the Enterprise” survey also reports that Spark NLP is the 5th most widely used ML library overall

The annual O’Reilly report on AI Adoption in the Enterprise, released in February 2019, is a survey of 1,300 practitioners across multiple industry verticals. It asked respondents about the revenue-bearing AI projects their organizations have in production, and asked them to list all the ML or AI frameworks and tools they use.

The Spark NLP library was listed as the 5th most popular across all AI frameworks, following only scikit-learn, TensorFlow, Keras, and PyTorch. It was also by far the most widely used NLP library, twice as common as spaCy, the next closest NLP library in this ranking.

John Snow Labs Spark NLP main benefits

  • Accuracy. More accurate than spaCy, Stanford CoreNLP, NLTK, and OpenNLP, due to its implementation of recent deep learning networks and embeddings.
  • Speed. NLP pipelines can run 2-3 orders of magnitude faster when training custom NLP models.
  • Scalability. Built on Apache Spark ML, Spark NLP can scale on any Spark cluster, on-premises or on any cloud provider.
  • Production-grade codebase. Built for enterprises, in contrast to research-oriented libraries like AllenNLP and NLP Architect.
  • Permissive open source license. The library can be used freely, including in a commercial setting.
  • Full Python, Java and Scala APIs. Support for multiple programming languages lets teams take advantage of the implemented models without having to move data.
  • Frequent releases. Released about twice a month; there were 26 new releases in 2018.


John Snow Labs Spark NLP 2.0 – the biggest release to date

This Spark NLP 2.0 release merges 50 pull requests, improving accuracy and ease of use. It’s the largest single release since the library was first introduced.

Spark NLP is the first library to have a production-ready implementation of BERT embeddings for named entity recognition. Here are the biggest enhancements in this release:

  • Revamped and enhanced Named Entity Recognition (NER) deep learning models to a new state-of-the-art level, reaching up to 93% micro-averaged F1 score on the industry-standard benchmark
  • Word embeddings as well as BERT embeddings are now annotators
  • TensorFlow version upgrade and use of contrib LSTM Cells
  • Performance and memory usage improvements 
  • Revamped and expanded pre-trained pipelines list, and new pre-trained models for different languages and new example notebooks
  • OCR module improvements for increased accuracy.


About John Snow Labs

John Snow Labs Inc. is an award-winning healthcare AI company, accelerating progress in data science by taking on the headache of managing platforms, models and data. A third of the team have a PhD or MD degree and 75% of team members have at least a Master’s, coming from multiple disciplines covering data science, medicine, data engineering, pharma, security, and DataOps. A Delaware Corporation, John Snow Labs runs as a global virtual team located in 16 countries around the globe. The company believes in being great partners, in making customers wildly successful, and in using data philanthropy to make the world a better place.

Media Contact
Company Name: John Snow Labs
Contact Person: Ida Lucente
Phone: +1 (302) 786-5227
Address: 16192 Coastal Highway
City: Lewes
State: Delaware 19958
Country: United States
Website: www.johnsnowlabs.com