Awesome Open Source
Search results for python scraper
1,373 search results found
Scrapy
⭐
49,918
Scrapy, a fast high-level web crawling & scraping framework for Python.
Newspaper
⭐
13,147
News, full-text, and article metadata extraction in Python 3. Advanced docs:
Requests Html
⭐
13,100
Pythonic HTML Parsing for Humans™
Chinese Xinhua
⭐
10,425
📙 Chinese Xinhua Dictionary database, including xiehouyu (two-part allegorical sayings), idioms, words, and characters.
Portia
⭐
8,982
Visual scraping for Scrapy
Undetected Chromedriver
⭐
7,232
Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / CloudFlare IUAM)
Autoscraper
⭐
5,159
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Douyin_tiktok_download_api
⭐
4,844
🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous Douyin, Kuaishou, T
Mygptreader
⭐
4,267
A community-driven way to read and chat with AI bots - powered by chatGPT.
Python Scraping
⭐
4,136
Code samples from the book Web Scraping with Python http://shop.oreilly.com/product/0636920034391.do
Snscrape
⭐
3,992
A social networking service scraper in Python
Data Science
⭐
3,898
Collection of useful data science topics along with articles, videos, and code
Fake Useragent
⭐
3,356
Up-to-date simple useragent faker with real world database
Automatic Udemy Course Enroller Get Paid Udemy Courses For Free
⭐
3,010
Do you want to LEARN NEW STUFF for FREE? Don't worry, with the power of web-scraping and automation, this script will find the necessary Udemy coupons & enroll you for PAID UDEMY COURSES, ABSOLUTELY FREE!
Python
⭐
2,978
Python Books && Courses
Tweets_analyzer
⭐
2,894
Tweets metadata scraper & activity analyzer
Googlescraper
⭐
2,540
A Python module to scrape several search engines (like Google, Yandex, Bing, Duckduckgo, ...). Including asynchronous networking support.
Snoop
⭐
2,530
Snoop: a reconnaissance tool based on open data (OSINT world)
Trafilatura
⭐
2,447
Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Grab
⭐
2,292
Web Scraping Framework
Weibo_terminater
⭐
2,265
Final Weibo crawler: scrape anything from Weibo (comments, post contents, followers, anything). The Terminator
Bulk Downloader For Reddit
⭐
2,142
Downloads and archives content from reddit
Facebook Page Post Scraper
⭐
2,014
Data scraper for Facebook Pages, and also code accompanying the blog post How to Scrape Data From Facebook Page Posts for Statistical Analysis
Facebook Scraper
⭐
1,936
Scrape Facebook public pages without an API key
Twitterscraper
⭐
1,852
Scrape Twitter for Tweets
Linkedin_scraper
⭐
1,534
A library that scrapes Linkedin for user data
Jobfunnel
⭐
1,533
Scrape job websites into a single spreadsheet with no duplicates.
Recipe Scrapers
⭐
1,408
Python package for scraping recipes data
Jd Autobuy
⭐
1,291
Python crawler: automatic JD.com login and online flash-purchasing of products
Shot Scraper
⭐
1,285
A command-line utility for taking automated screenshots of websites
Cloudproxy
⭐
1,235
Hide your scraper's IP behind the cloud. Provision proxy servers across different cloud providers to improve your scraping success.
Gdom
⭐
1,180
DOM Traversing and Scraping using GraphQL
Cinemagoer
⭐
1,156
Cinemagoer is a Python package for retrieving and managing data from the IMDb movie database (with which we are not affiliated in any way) about movies, people, characters, and companies
Informer
⭐
1,141
A Telegram Mass Surveillance Bot in Python
Scrapy Cluster
⭐
1,137
This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.
Event Collect
⭐
1,110
event website listing to Open Event format scraper and converter
Animdl
⭐
1,105
A highly efficient, fast, powerful and light-weight anime downloader and streamer for your favorite anime.
Fansly Downloader
⭐
1,103
Easy to use fansly.com content downloading tool. Written in python, but ships as a standalone Executable App for Windows too. Enjoy your Fansly content offline anytime, anywhere in the highest possible content resolution! Fully customizable to download in bulk or single: photos, videos & audio from timeline, messages, collection & specific posts 👍
Django Dynamic Scraper
⭐
1,069
Creating Scrapy scrapers via the Django admin interface
Scanless
⭐
1,061
online port scan scraper
Parliament Scraper
⭐
1,049
Public Data Scraper for Parliament Data for the EU and other Parliaments
Redditdownloader
⭐
1,045
Scrapes Reddit to download media of your choice.
Crawler User Agents
⭐
1,045
Syntactic patterns of HTTP user-agents used by bots / robots / crawlers / scrapers / spiders. pull-request welcome ⭐
Osi.ig
⭐
1,027
Information Gathering Instagram.
Parsel
⭐
1,010
Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors
Finviz
⭐
963
Unofficial API for finviz.com
Querido Diario
⭐
944
📰 Brazilian government gazettes, accessible to everyone.
Mlscraper
⭐
935
🤖 Scrape data from HTML websites automatically by just providing examples
Instagram Crawler
⭐
922
Get Instagram posts/profile/hashtag data without using Instagram API
Clean Text
⭐
810
🧹 Python package for text cleaning
Scrapyrt
⭐
793
HTTP API for Scrapy spiders
Bot
⭐
790
Completely free and open-source human-like Instagram bot. Powered by UIAutomator2 and compatible with basically any Android device 5.0+ that can run Instagram - real or emulated.
Loconotion
⭐
775
📄 Python tool to turn Notion.so pages into lightweight, customizable static websites
Amazon Scraper Python
⭐
766
Non-official client to get some info about products sold on Amazon
Lulu
⭐
752
[Unmaintained] A simple and clean video/music/image downloader 👾
Scweet
⭐
720
A simple and unlimited Twitter scraper: scrape tweets, likes, retweets, following, followers, user info, images...
Gazpacho
⭐
716
🥫 The simple, fast, and modern web scraping library
Domain
⭐
713
Setup script for Recon-ng
Edu Mail Generator
⭐
707
Generate Free Edu Mail(s) within minutes
Episode Rename
⭐
699
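Automated renaming tool for TV series and anime, with one-click batch renaming. Can be paired with qBittorrent to rename files automatically after download, which makes automatic metadata scraping in Emby easier. Runs on Windows, Linux, macOS, Docker, and as a Synology package.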
Bookcorpus
⭐
698
Crawl BookCorpus
Pdfquery
⭐
693
A fast and friendly PDF scraping library.
Google Play Scraper
⭐
645
Google play scraper for Python inspired by <facundoolano/google-play-scraper>
Dataengineeringproject
⭐
644
Example end to end data engineering project.
Tiktoklive
⭐
623
Python library to receive live stream events (comments, gifts, etc.) in realtime from TikTok LIVE.
Easy Scraping Tutorial
⭐
618
Simple but useful Python web scraping tutorial code.
Kuwala
⭐
610
Kuwala is the no-code data platform for BI analysts and engineers enabling you to build powerful analytics workflows. We are set out to bring state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations together in one intuitive interface built with React Flow. In addition we provide third-party data into data science models and products with a focus on geospatial data. Currently, the following data connectors are available worldwide: a) High-resolution demograp
Urs
⭐
604
Universal Reddit Scraper - A comprehensive Reddit scraping command-line tool.
Linkedin
⭐
602
Linkedin Scraper using Selenium Web Driver, Chromium headless, Docker and Scrapy
Imagescraper
⭐
572
✂️ High performance, multi-threaded image scraper
Instascrape
⭐
554
Powerful and flexible Instagram scraping library for Python, providing easy-to-use and expressive tools for accessing data programmatically
Dryscrape
⭐
523
[not actively maintained] A lightweight Python library that uses WebKit to enable easy scraping of dynamic, JavaScript-heavy web pages
Social Media Profiles Regexs
⭐
508
📇 Extract social media profiles and more with regular expressions
Alltheplaces
⭐
502
A set of spiders and scrapers to extract location information from places that post their location on the internet.
Complete Life Cycle Of A Data Science Project
⭐
499
Complete-Life-Cycle-of-a-Data-Science-Project
Comic Dl
⭐
498
Comic-dl is a command line tool to download manga and comics from various comic and manga sites. Supported sites : readcomiconline.to, mangafox.me, comic naver and many more.
Jekyll
⭐
498
Jekyll-based static site for The Programming Historian
Goop
⭐
488
Google Search Scraper
Spidermon
⭐
486
Scrapy Extension for monitoring spiders execution.
Twitter_scraping
⭐
479
Grab all a user's tweets (and get past 3200 limit)
Telegram Members Adder
⭐
475
Telegram Members Adding Software/Script Using Termux.
Cryptocmd
⭐
472
Cryptocurrency historical price data library in Python. Data from https://coinmarketcap.com.
Tinderbotz
⭐
468
Automated Tinder bot and scraper using selenium in python.
Openwebtext
⭐
463
Open clone of OpenAI's unreleased WebText dataset scraper. This version uses pushshift.io files instead of the API for speed.
Pywebcopy
⭐
455
Locally saves webpages to your hard disk with images, css, js & links as is.
Scrapple
⭐
452
A framework for creating semi-automatic web content extractors
Awesome Scrapy
⭐
450
A curated list of awesome packages, articles, and other cool resources from the Scrapy community.
Pdf.tocgen
⭐
444
A CLI toolset to generate table of contents for PDF files automatically.
Auto Archiver
⭐
439
Automatically archive links to videos, images, and social media content from Google Sheets (and more).
Mdcx
⭐
435
Movie metadata scraper
Google Search Results Python
⭐
432
Google Search Results via SERP API pip Python Package
Covid_19
⭐
428
COVID19 case numbers of Cantons of Switzerland and Principality of Liechtenstein (FL). The data is updated at best once a day (times of collection and update may vary). Start with the README.
Newsdiffs
⭐
418
Automatic scraper that tracks changes in news articles over time.
Search Engine Parser
⭐
416
Lightweight package to query popular search engines and scrape for result titles, links and descriptions
Fbcrawl
⭐
415
A Facebook crawler
Docker Selenium Lambda
⭐
402
The simplest demo of chrome automation by python and selenium in AWS Lambda
Dude
⭐
397
dude uncomplicated data extraction: A simple framework for writing web scrapers using Python decorators
Scavenger
⭐
384
Crawler (Bot) searching for credential leaks on paste sites.
Basketball_reference_web_scraper
⭐
382
NBA Stats API via Basketball Reference
Search Engines Scraper
⭐
377
Search Google, Bing, Yahoo, and other search engines with Python
1-100 of 1,373 search results
Copyright 2018-2024 Awesome Open Source. All rights reserved.