Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Scrapy | 49,918 | 4,185 | 445 | 3 months ago | 96 | September 18, 2023 | 692 | bsd-3-clause | Python | |
Scrapy, a fast high-level web crawling & scraping framework for Python. | ||||||||||
Crawlee | 12,158 | 42 | 6 days ago | 747 | December 10, 2023 | 96 | apache-2.0 | TypeScript | ||
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation. | ||||||||||
Crawlab | 10,521 | 4 months ago | 1 | March 03, 2019 | 58 | bsd-3-clause | Go | |||
Distributed web crawler admin platform for spider management, regardless of language or framework. | ||||||||||
Spider Flow | 8,075 | a year ago | 20 | mit | Java | |||||
A new-generation crawler platform that defines crawler workflows graphically, so crawlers can be built without writing code. | ||||||||||
Katana | 7,995 | 1 | 3 months ago | 8 | September 14, 2023 | 67 | mit | Go | ||
A next-generation crawling and spidering framework. | ||||||||||
Awesome Web Scraping | 6,060 | 5 months ago | 1 | other | Makefile | |||||
List of libraries, tools and APIs for web scraping and data processing. | ||||||||||
Awesome Crawler | 5,859 | 5 months ago | 27 | mit | ||||||
A collection of awesome web crawlers and spiders in different languages | ||||||||||
Autoscraper | 5,159 | 1 | a year ago | 16 | July 17, 2022 | 9 | mit | Python | ||
A Smart, Automatic, Fast and Lightweight Web Scraper for Python | ||||||||||
Douyin_tiktok_download_api | 4,844 | 5 months ago | 21 | September 23, 2023 | 60 | mit | Python | |||
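🚀 Douyin_TikTok_Download_API is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili, supporting API calls and online batch parsing and downloading. | ||||||||||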
Rod | 4,505 | 140 | 4 months ago | 406 | November 06, 2023 | 106 | mit | Go | ||
A Devtools driver for web automation and scraping |