Difficulty: beginner
Estimated Time: 20 minutes

Natural Language Processing (NLP) covers a range of tasks, including:

  • Part of speech tagging
  • Word segmentation
  • Named entity recognition
  • Machine translation
  • Question answering
  • Sentiment analysis
  • Topic segmentation and recognition
  • Natural language generation

All of these tasks start with preparing text for further processing. In this lab you will learn how to use some of NLTK's capabilities to clean and prepare text data.

Introduction to NLTK

Step 1 of 4

Read data

To start working with Python, run the following command:

python

In this scenario we will work with the NLTK library. We will repeat some of the work from the previous scenario, this time using the library instead of vanilla Python.

Let's read movie reviews again.

import data_reader
documents = data_reader.read_reviews()

Then we can look at an example document (feel free to change the index and load a different document).

example_idx = 123
document = documents[example_idx]
document
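The data_reader module is provided by the scenario environment, and its source is not shown here. If you want to follow along elsewhere, the sketch below shows roughly what such a helper might look like. It is an assumption, not the scenario's actual code: it supposes each review is stored as a separate .txt file in one directory, and the directory argument, file names, and sample review texts are all made up for illustration.

```python
import tempfile
from pathlib import Path


def read_reviews(directory):
    """Return the text of every .txt file in `directory` as a list of strings.

    Hypothetical stand-in for data_reader.read_reviews(); the real helper
    may read a different format and may take no arguments.
    """
    # Sort the paths so that a given index always refers to the same review.
    paths = sorted(Path(directory).glob("*.txt"))
    return [path.read_text(encoding="utf-8") for path in paths]


# Demo: write two made-up reviews to a temporary directory and read them back.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "review_000.txt").write_text(
        "A touching, well-acted film.", encoding="utf-8"
    )
    Path(tmp, "review_001.txt").write_text(
        "Two hours I will never get back.", encoding="utf-8"
    )
    documents = read_reviews(tmp)

# Same access pattern as in the scenario: pick a document by index.
example_idx = 1
document = documents[example_idx]
print(document)
```

Whatever the storage format, the result is a plain Python list of strings, so indexing with example_idx works the same way as with the scenario's own documents.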