Building Pipelines for Natural Language Understanding with Spark

Video description

This course is designed for engineers and data scientists who have some familiarity with Scala, Apache Spark, and machine learning, and who need to process large amounts of natural language text in a distributed fashion. We will use a sample of posts from the subreddit /r/WritingPrompts, which contains short stories and comments about those stories.

The course has four parts:

  1. Building a natural language processing and entity extraction pipeline on Scala & Spark
  2. Machine Learning Applications for Statistical Natural Language Understanding at Scale
  3. Topic Modeling on Natural Language with Scala, Spark and MLlib
  4. Deep Learning Applications for Natural Language Understanding with Scala, Spark and MLlib

You will learn how to use Apache Spark to process text with annotations, apply machine learning to your annotations, create and use topic models, and create and use a word2vec model.

Product information

  • Title: Building Pipelines for Natural Language Understanding with Spark
  • Author(s): David Talby, Alex Thomas
  • Release date: December 2016
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781491978122