Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Scrapy | 49,918 | 4,185 | 445 | 5 months ago | 96 | September 18, 2023 | 692 | bsd-3-clause | Python | |
Scrapy, a fast high-level web crawling & scraping framework for Python. | ||||||||||
Huginn | 42,091 | 69 | 52 | 11 days ago | 8 | September 22, 2017 | 698 | mit | Ruby | |
Create agents that monitor and act on your behalf. Your agents are standing by! | ||||||||||
Crawlee | 12,871 | 42 | 2 days ago | 747 | December 10, 2023 | 96 | apache-2.0 | TypeScript | ||
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation. | ||||||||||
Awesome Web Scraping | 6,060 | 7 months ago | 1 | other | Makefile | |||||
List of libraries, tools and APIs for web scraping and data processing. | ||||||||||
Awesome Crawler | 5,859 | 7 months ago | 27 | mit | ||||||
A collection of awesome web crawlers and spiders in different languages | ||||||||||
Autoscraper | 5,159 | 1 | a year ago | 16 | July 17, 2022 | 9 | mit | Python | ||
A Smart, Automatic, Fast and Lightweight Web Scraper for Python | ||||||||||
Douyin_tiktok_download_api | 4,844 | 7 months ago | 21 | September 23, 2023 | 60 | mit | Python | |||
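🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili, supporting API calls as well as online batch parsing and downloading. | ||||||||||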
Rod | 4,505 | 140 | 5 months ago | 406 | November 06, 2023 | 106 | mit | Go | ||
A Devtools driver for web automation and scraping | ||||||||||
Node Osmosis | 4,083 | 218 | 58 | a year ago | 27 | March 01, 2019 | 117 | JavaScript | ||
Web scraper for NodeJS | ||||||||||
Browser Fingerprinting | 3,353 | a year ago | 7 | JavaScript | ||||||
Analysis of Bot Protection systems with available countermeasures 🚿. How to defeat anti-bot systems 👻 and get around browser fingerprinting scripts 🕵️‍♂️ when scraping the web?