Tweetscraper

TweetScraper is a simple crawler/spider for Twitter Search without using the Twitter API.
Alternatives To Tweetscraper
Project Name | Stars | Language | License | Description
T | 5,410 | Ruby | MIT | A command-line power tool for Twitter.
Twitterscraper | 1,852 | Python | MIT | Scrape Twitter for Tweets.
Apps.loklak.org | 1,080 | JavaScript | LGPL-2.1 | loklak apps site http://apps.loklak.org
Getoldtweets Python | 1,053 | Python | MIT | A project written in Python to get old tweets; it bypasses some limitations of the official Twitter API.
Twitter Advanced Search | 885 | | | Advanced Search for Twitter.
Rtweet | 778 | R | other | 🐦 R client for interacting with Twitter's [stream and REST] APIs.
Search Tweets Python | 738 | Python | MIT | Python client for the Twitter 'search Tweets' and 'count Tweets' endpoints (v2/Labs/premium/enterprise). Now supports Twitter API v2 /recent and /all search endpoints.
Tweetscraper | 698 | Python | GPL-2.0 | TweetScraper is a simple crawler/spider for Twitter Search without using the Twitter API.
Chatterbot | 495 | Ruby | MIT | A straightforward Ruby-based Twitter bot framework, using OAuth to authenticate.
Emailharvester | 399 | Python | GPL-3.0 | Email addresses harvester.


Readme

Introduction

TweetScraper can get tweets from Twitter Search. It is built on Scrapy and does not use Twitter's APIs. The crawled data is not as clean as data obtained through the APIs, but the benefit is that you are not bound by the API's rate limits and restrictions; ideally, you can get all the data from Twitter Search.

WARNING: please be polite and follow the crawler's politeness policy.
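
One concrete way to stay polite is to throttle the crawler with Scrapy's standard settings in TweetScraper/settings.py. The setting names below are standard Scrapy, but the values are an illustrative sketch, not the project's shipped defaults:

    # TweetScraper/settings.py -- illustrative throttling values, not the project's defaults
    DOWNLOAD_DELAY = 2                    # wait ~2 seconds between requests
    CONCURRENT_REQUESTS_PER_DOMAIN = 1    # one in-flight request per domain
    AUTOTHROTTLE_ENABLED = True           # let Scrapy back off when responses slow down
    AUTOTHROTTLE_START_DELAY = 1
    AUTOTHROTTLE_MAX_DELAY = 10
    RETRY_TIMES = 2                       # give up early instead of hammering the site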

Installation

  1. Install conda; you can get it from Miniconda. The tested Python version is 3.7.

  2. Install the Selenium Python bindings: https://selenium-python.readthedocs.io/installation.html. (Note: a KeyError: 'driver' error is usually caused by an incorrect Selenium setup; see the sanity-check sketch after this list.)

  3. For Ubuntu or Debian users, run:

    $ bash install.sh
    $ conda activate tweetscraper
    $ scrapy list
    $ # If the output is 'TweetScraper', you are ready to go.
    

    install.sh creates a new conda environment named tweetscraper and installs all the dependencies (e.g., firefox and firefox-geckodriver).
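
To verify that Selenium and geckodriver are wired up correctly (the usual cause of the KeyError: 'driver' mentioned in step 2), you can run a minimal headless Firefox session. This is just a sanity-check sketch, independent of TweetScraper itself:

    # check_selenium.py -- standalone sanity check for the Firefox/geckodriver setup
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    options.add_argument("-headless")           # run Firefox without a visible window
    driver = webdriver.Firefox(options=options)
    driver.get("https://twitter.com/search?q=foo")
    print(driver.title)                         # prints the page title if the setup works
    driver.quit()

If this script fails because geckodriver cannot be found, make sure firefox-geckodriver is installed (install.sh handles this on Ubuntu/Debian) and available on your PATH.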

Usage

  1. Change the USER_AGENT in TweetScraper/settings.py to identify who you are:

     USER_AGENT = 'your website/e-mail'
    
  2. In the root folder of this project, run a command like:

     scrapy crawl TweetScraper -a query="foo,#bar"
    

    where query is a list of keywords separated by commas and quoted with ". The query can be anything (a keyword, hashtag, etc.) you want to search for in Twitter Search. TweetScraper will crawl the search results of the query and save the tweet content and user information.

  3. By default, tweets are saved to disk under ./Data/tweet/ and user data under ./Data/user/. The file format is JSON. Change SAVE_TWEET_PATH and SAVE_USER_PATH in TweetScraper/settings.py if you want another location; a sketch for reading the saved files follows this list.
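
Because each crawled item is written out as JSON, post-processing the output takes only a few lines of Python. The snippet below is a sketch that assumes one JSON object per file under the default ./Data/tweet/ path and a text field holding the tweet body; check TweetScraper/items.py for the actual field names.

    # read_tweets.py -- sketch for loading crawled tweets from the default save path
    # Assumption: one JSON object per file; the 'text' key is illustrative.
    import json
    from pathlib import Path

    tweet_dir = Path("Data/tweet")
    for path in sorted(p for p in tweet_dir.iterdir() if p.is_file()):
        with open(path, encoding="utf-8") as f:
            tweet = json.load(f)
        print(tweet.get("text", "")[:80])       # first 80 characters of each tweet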

Acknowledgement

Keeping the crawler up to date requires continuous effort; please support our work via opencollective.com/tweetscraper.

License

TweetScraper is released under the GNU General Public License, Version 2.
