Awesome Open Source
The Top 10 Webscraper Open Source Projects
Open source projects categorized as Webscraper
apify/crawlee
⭐ 11,229
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.
Dependent packages: 0 | Total releases: 0 | Most recent commit: over 2 years ago
crawlab-team/crawlab
⭐ 10,521
Distributed web crawler admin platform for managing spiders, regardless of language or framework.
Dependent packages: 0 | Total releases: 0 | Most recent commit: over 2 years ago
ssssssss-team/spider-flow
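⭐ 8,075
A new-generation crawler platform that defines crawler workflows graphically, so a crawler can be built without writing code.
Dependent packages: 0 | Total releases: 0 | Most recent commit: about 3 years ago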
BruceDone/awesome-crawler
⭐ 5,859
A collection of awesome web crawlers and spiders in different languages
Dependent packages: 0 | Total releases: 0 | Most recent commit: over 2 years ago
alirezamika/autoscraper
⭐ 5,159
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Dependent packages: 0 | Total releases: 0 | Most recent commit: about 3 years ago
hakluke/hakrawler
⭐ 4,120
Simple, fast web crawler designed for easy, quick discovery of endpoints and assets within a web application
Dependent packages: 0 | Total releases: 0 | Most recent commit: over 2 years ago
rchipka/node-osmosis
⭐ 4,083
Web scraper for NodeJS
Dependent packages: 0 | Total releases: 0 | Most recent commit: almost 3 years ago
php-curl-class/php-curl-class
⭐ 3,208
PHP Curl Class makes it easy to send HTTP requests and integrate with web APIs
Dependent packages: 0 | Total releases: 0 | Most recent commit: over 2 years ago
CrawlScript/WebCollector
⭐ 2,974
WebCollector is an open source web crawler framework based on Java. It provides some simple interfaces for crawling the Web; you can set up a multi-threaded web crawler in less than 5 minutes.
Dependent packages: 0 | Total releases: 0 | Most recent commit: about 3 years ago
apache/nutch
⭐ 2,742
Apache Nutch is an extensible and scalable web crawler
Dependent packages: 0 | Total releases: 0 | Most recent commit: over 2 years ago
Copyright 2018-2026 Awesome Open Source. All rights reserved.