Awesome Open Source


Building an end-to-end NLP pipeline for small teams to do user research with Twitter data

Project Intro/Objective

See the Wiki! This project is a part of the Data Science Working Group at Code for San Francisco. Other DSWG projects can be found at the main GitHub repo.


Please refer to this article for how these folders should work together.

The "/main" folder is for production code and has four subfolders:

  • /data
  • /code
  • /pipeline
  • /output

Use the "/sandbox" folder for experiments and exploratory work. The "/outreach" folder is for organizing materials used to produce presentations.
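
For anyone setting up a fresh working copy, the layout described above could be created with a short script. This is only a minimal sketch: the folder names come from this README, but the function name `scaffold_layout` and the idea of scaffolding the folders with a script are assumptions for illustration, not part of the project.

```python
from pathlib import Path

# Folder names are taken from this README. Everything else here
# (the constants and the function name) is hypothetical.
MAIN_SUBFOLDERS = ("data", "code", "pipeline", "output")
TOP_LEVEL_FOLDERS = ("sandbox", "outreach")


def scaffold_layout(root):
    """Create the README's folder layout under root and return the paths."""
    root = Path(root)
    paths = [root / "main" / name for name in MAIN_SUBFOLDERS]
    paths += [root / name for name in TOP_LEVEL_FOLDERS]
    for path in paths:
        # exist_ok=True makes the call safe to repeat on an existing checkout.
        path.mkdir(parents=True, exist_ok=True)
    return paths
```

Calling `scaffold_layout(".")` from the repository root would create any missing folders and leave existing ones untouched.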

Project Status: In Discovery

Methods Used


  • Python
  • spaCy
  • scikit-learn
  • gensim


Contributing NLTweets Members

Name            Slack Handle
Daniel Zou      @daniel.zou
Josh Freivogel  @Josh Freivogel
Nathan Chau     @Nathan Chau


  • If you haven't joined the SF Brigade Slack, you can do that here.
  • Our Slack channel is #nltweets.
  • Feel free to contact team leads with any questions or if you are interested in contributing!
