Awesome Open Source


Hands-On NLTK Tutorial

A hands-on NLTK tutorial in the form of Jupyter notebooks.

NLTK is one of the most popular Python packages for Natural Language Processing (NLP).

Index of Jupyter Notebooks

1.1 Downloading Libs and Testing That They Are Working
Getting ready to start!
1.2 Text Analysis Using nltk.text
Extracting interesting data from a given text
2.1 Deriving N-Grams from Text
Creating n-grams (for language classification)
2.2 Detecting Text Language by Counting Stop Words
A simple way to find out what language a text is written in
2.3 Language Identifier Using Word Bigrams
State-of-the-art language classifier
3.1 Bigrams, Stemming and Lemmatizing
NLTK makes bigrams, stemming and lemmatization super-easy
3.2 Finding Unusual Words in Given Language
Which words do not belong with the rest of the text?
3.3 Creating a POS Tagger
Creating a Parts Of Speech tagger
3.4 Parts of Speech and Meaning
Exploring awesome features offered by WordNet
4.1 Name Gender Identifier
Building a classifier that guesses the gender of a name
4.2 Classifying News Documents into Categories
Building a classifier that guesses the category of a news item
5.1 Sentiment Analysis
Is a movie review positive or negative?
5.2 Sentiment Analysis with nltk.sentiment.SentimentAnalyzer and VADER tools
More sentiment analysis!
6.1 Twitter Stream (and Cleaning Tweets)
Live-stream tweets from Twitter
6.2 Twitter Search
Search through past tweets
7.1 NLTK with the Greek Script
Using NLTK with non-Latin scripts
8.1 The langdetect and langid Libraries
Useful libraries for language identification
8.2 Word2Vec (gensim)
Google's Word2vec
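
To give a flavour of the ideas behind notebooks 2.1 and 2.2, here is a minimal sketch in plain Python of deriving n-grams and guessing a text's language by counting stop words. It is not taken from the notebooks: the `ngrams` and `guess_language` helpers and the tiny `STOP_WORDS` lists are hypothetical and chosen for illustration only. The notebooks themselves presumably rely on NLTK's own utilities and full stop-word corpora.

```python
from collections import Counter

# Tiny, hypothetical stop-word lists for illustration only; a real notebook
# would more likely load complete lists, e.g. from nltk.corpus.stopwords.
STOP_WORDS = {
    "english": {"the", "and", "is", "of", "to", "in", "a", "that"},
    "french": {"le", "la", "et", "est", "de", "un", "une", "que"},
    "german": {"der", "die", "und", "ist", "von", "ein", "eine", "das"},
}


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) over a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def guess_language(text):
    """Guess the language whose stop words occur most often in the text.

    Returns None when no stop word from any list is found.
    """
    tokens = text.lower().split()
    counts = Counter()
    for language, stop_words in STOP_WORDS.items():
        counts[language] = sum(1 for token in tokens if token in stop_words)
    language, hits = counts.most_common(1)[0]
    return language if hits > 0 else None


if __name__ == "__main__":
    print(ngrams("the cat sat on the mat".split(), 2))
    print(guess_language("le chat est sur la table et il dort"))
```

Whitespace splitting keeps the sketch dependency-free; a proper tokenizer and full stop-word lists would be needed for anything beyond a toy input.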


H. Z. Sababa — hb20007 — [email protected]

Distributed under the MIT license. See LICENSE for more information.
