Twitter users often associate and socialize with other users based on similar interests. The Tweets of these users can be classified using a trained latent Dirichlet allocation (LDA) model to automate the discovery of their similarities.
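As a rough illustration of what "discovering similarities" could look like once each user's Tweets have been reduced to a topic distribution, the sketch below compares users with the Hellinger distance. This is only one possible measure, and the function name `hellinger` and the example distributions are hypothetical; they are not taken from this repository.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions.

    Returns 0.0 for identical distributions and 1.0 for distributions
    that share no probability mass.
    """
    if len(p) != len(q):
        raise ValueError("distributions must cover the same number of topics")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2.0)


# Hypothetical topic distributions over four topics (each sums to 1).
# In practice these would come from a trained LDA model.
user_a = [0.70, 0.10, 0.10, 0.10]
user_b = [0.65, 0.15, 0.10, 0.10]
user_c = [0.05, 0.05, 0.10, 0.80]

print(hellinger(user_a, user_b))  # small distance: similar interests
print(hellinger(user_a, user_c))  # larger distance: different interests
```

A smaller distance suggests that two users tweet about a similar mix of topics.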
Python 2.7 is recommended since the pattern library is currently incompatible with most Python 3 versions.
Python 3.6 can be used with the pattern library, though it may need to be built from source since most newer Linux distributions don't come with it pre-installed. The commands to build Python 3.6 from source are provided in the linux_setup_py3.6.sh script.
git clone https://github.com/kethort/twitter_LDA_topic_modeling.git
Run bash script:
Python pip requirements included in these files:
# for Python 2.7
pip install -r requirements_py2.txt
# for Python 3
pip install -r requirements_py3.txt
Link to the simple-wikipedia dump:
The installation is very similar to the Linux installation:
Extra install instructions are in osx_setup_py3.6.info.
pip install -r requirements_py3_OSX.txt
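The choice of requirements file described in the installation steps can be sketched as a small helper. The function `requirements_file` is hypothetical and not part of this repository. It also assumes that Python 2.7 uses requirements_py2.txt on every platform, which the instructions do not state for OS X.

```python
import sys


def requirements_file(version_info=None, platform=None):
    """Return the requirements file name for an interpreter and platform.

    Both arguments default to the running interpreter's values.
    """
    version_info = sys.version_info if version_info is None else version_info
    platform = sys.platform if platform is None else platform

    if version_info[0] == 2:
        # Assumption: the Python 2 file is used regardless of platform.
        return "requirements_py2.txt"
    if platform == "darwin":
        # sys.platform is "darwin" on OS X / macOS.
        return "requirements_py3_OSX.txt"
    return "requirements_py3.txt"


print("pip install -r " + requirements_file())
```

Running it prints the pip command that matches the current interpreter.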