Awesome Open Source
Search results for webspider
342 search results found
Python Spider
⭐
16,816
🌈 Python 3 web crawlers in practice: Taobao, JD.com, NetEase Cloud Music, Bilibili, 12306, Douyin, Biquge, comic and novel downloads, …
Crawlee
⭐
12,158
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.
Crawlab
⭐
10,521
Distributed web crawler admin platform for managing spiders, regardless of language or framework.
Pythonpark
⭐
8,310
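Open-source Python project "The Road to Self-Taught Programming": step-by-step tutorials covering an AI lab, curated videos, data structures, study guides, machine learning in practice, deep…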
Spider Flow
⭐
8,075
A next-generation crawler platform that defines crawl workflows graphically, so a crawler can be built without writing code.
Katana
⭐
7,995
A next-generation crawling and spidering framework.
Awesome Crawler
⭐
5,859
A collection of awesome web crawlers and spiders in different languages
Proxypool
⭐
5,154
An Efficient ProxyPool with Getter, Tester and Server
Hakrawler
⭐
4,120
Simple, fast web crawler designed for easy, quick discovery of endpoints and assets within a web application
Generalnewsextractor
⭐
3,180
General-purpose extractor for the main text of news web pages (beta version).
Gerapy
⭐
3,144
Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vue.js
Webcollector
⭐
2,974
WebCollector is an open source web crawler framework based on Java. It provides some simple interfaces for crawling the Web; you can set up a multi-threaded web crawler in less than 5 minutes.
Nutch
⭐
2,742
Apache Nutch is an extensible and scalable web crawler
Gecco
⭐
2,403
Easy-to-use lightweight web crawler
Gospider
⭐
2,190
Gospider - Fast web spider written in Go
Abot
⭐
1,991
Cross Platform C# web crawler framework built for speed and flexibility. Please star this project! +1.
Pspider
⭐
1,675
A simple, easy-to-use Python crawler framework. QQ discussion group: 597510560
Python3webspider
⭐
1,176
Source files for my book on WebSpider
Spider
⭐
907
A configurable web spider with an easy-to-use web console
Storm Crawler
⭐
834
A scalable, mature and versatile web crawler based on Apache Storm
Spidr
⭐
775
A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
Fetchbot
⭐
758
A simple and flexible web crawler that follows the robots.txt policies and crawl delays.
Zhihu Spider
⭐
719
A web spider for zhihu.com
Marginaliasearch
⭐
711
Internet search engine for text-oriented websites. Indexing the small, old and weird web.
Browsertrix Crawler
⭐
470
Run a high-fidelity browser-based crawler in a single Docker container
Company Crawler
⭐
466
Crawlers for Tianyancha and Qichacha that scrape company information for a given keyword
Spidersuite
⭐
447
Advanced web spider/crawler for cyber security professionals
Wereadscan
⭐
447
A crawler that scans purchased books on WeRead (微信读书) and downloads them as local PDFs
Ache
⭐
433
ACHE is a web crawler for domain-specific search.
Pulsarrpa
⭐
413
Automate webpages at scale, scrape web data completely and accurately with high performance, distributed RPA.
Dcrawl
⭐
411
Simple, but smart, multi-threaded web crawler for randomly gathering huge lists of unique domain names.
Sparkler
⭐
401
Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
Kochat
⭐
383
Opensource Korean chatbot framework
Learn To Identify Similar Images
⭐
358
A record of my Python scripts for learning to identify similar images
Python3webcrawler
⭐
347
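🌈 Python 3 web crawlers in practice: QQ Music songs, JD.com product information, Fang.com (房天下), cracking Youdao Translate, building a proxy pool, Douban…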
Archivebot
⭐
328
ArchiveBot, an IRC bot for archiving websites
Supercrawler
⭐
324
A web crawler. Supercrawler automatically crawls websites. Define custom handlers to parse content. Obeys robots.txt, rate limits and concurrency limits.
Spidy
⭐
287
The simple, easy to use command line web crawler.
Crawler
⭐
285
Library for Rapid (Web) Crawler and Scraper Development
Gopa
⭐
281
[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn
Ant
⭐
271
A web crawler for Go
Technicalconceptsforinterviews
⭐
265
Various technical concepts for interviews - Feel free to contribute and make it better!
Lagoujob
⭐
250
Job data mining repo for lagou.com
Dark Fantasy Hack Tool
⭐
248
DDOS Tool: To take down small websites with HTTP FLOOD. Port scanner: To know the open ports of a site. FTP Password Cracker: To hack file system of websites. Banner Grabber: To get the service or software running on a port. (After knowing the software running, google for its vulnerabilities.) Web Spider: For gathering web application hacking information. Email scraper: To get all emails related to a webpage. IMDB Rating: Easy way to access the movie database. Both .exe (compressed as zip) and .py
News Crawl
⭐
229
News crawling with StormCrawler - stores content as WARC
Infinitycrawler
⭐
221
A simple but powerful web crawler library for .NET
Crawler Commons
⭐
217
A set of reusable Java components that implement functionality common to any web crawler
Selenops
⭐
215
A Swift Web Crawler 🕷
Awesome Web Scraper
⭐
214
A collection of awesome web scrapers and crawlers.
Crawley
⭐
208
The unix-way web crawler
Strong Web Crawler
⭐
204
An advanced web crawler built on C#/.NET, PhantomJS, and Selenium. Can execute JavaScript…
Portia Dashboard
⭐
190
portia-dashboard is a visual web crawler based on scrapinghub/portia
Go Pkg Spider
⭐
190
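A relatively intelligent, general-purpose Golang library for extracting data from news websites, with no rules to maintain. Includes domain probing, page encoding and language detection, page link classification…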
Ignareo Isml Auto Voter
⭐
186
Ignareo the Carillon, a web crawler/spider template of ultimate high concurrency built for leprechauns. Carillons as the best web spiders; Long live the golden years of leprechauns! (ISML=international saimoe; 2022 ISML is last ISML)
Spider Less
⭐
186
Web spider as a service, spider on serverless
Crawlab Lite
⭐
184
Lite version of Crawlab, the crawler management platform.
Digger
⭐
180
Digger is a powerful and flexible web crawler implemented in pure Go
Zhihu Crawler People
⭐
179
A simple distributed crawler for zhihu && data analysis
Antch
⭐
177
Antch, a fast, powerful and extensible web crawling & scraping framework for Go
Musicer
⭐
176
Aims to bring music platforms such as NetEase Cloud Music, Kugou, QQ, and Kuwo together in one place
Crawler_shopee_public
⭐
169
Asynchronous Shopee crawler + competitor seller analysis
Itsy
⭐
168
A threaded web-spider written in Clojure
Collector Http
⭐
162
Norconex Web Crawler (or spider) is a flexible web crawler for collecting, parsing, and manipulating data from the Internet (or Intranet) to various data repositories such as search engines.
Netpwn
⭐
161
Tool made to automate tasks of pentesting.
Cocrawler
⭐
159
CoCrawler is a versatile web crawler built using modern tools and concurrency.
Google News Scraper
⭐
144
Lightweight scraper for Google News
Direct_web_spider
⭐
143
A direct web spider framework for Ruby
Not Your Average Web Crawler
⭐
130
A web crawler (for bug hunting) that gathers more than you can imagine.
Ospider
⭐
124
Open-source tool for acquiring and preprocessing vector geographic data (POI / AOI / administrative regions / road networks / land use)
Proxy
⭐
123
A simple tool for fetching usable proxies from several websites.
Dyer
⭐
118
Dyer is designed for reliable, flexible and fast web crawling, providing some high-level, comprehensive features without compromising speed.
Evine
⭐
117
Interactive CLI Web Crawler
Raspagem De Dados Para Iniciantes
⭐
115
Data scraping for beginners using Scrapy and other basic libraries
Crawlbox
⭐
112
Easy way to brute-force web directory.
Gflare Tk
⭐
110
Open-Source Python Based SEO Web Crawler
Abotx
⭐
106
Cross Platform C# Web crawler framework, headless browser, parallel crawler. Please star this project! +1.
Node Web Crawler
⭐
104
A web scraper with a web user interface which shows scraping stats in realtime. Uses Node.JS, jQuery, socket.io and Express.
Zhihu_crawler
⭐
100
a crawler for zhihu
Crawl Anywhere
⭐
98
Crawl-Anywhere - Web Crawler and document processing pipeline with Solr integration.
Krawler
⭐
96
A web crawling framework written in Kotlin
Polipus
⭐
95
Polipus: distributed and scalable web-crawler framework
Terpene Profile Parser For Cannabis Strains
⭐
93
Parser and database to index the terpene profile of different strains of Cannabis from online databases
Tacocat
⭐
86
A platform displaying the latest software engineer job information to entry-level new graduates
Webcrawler
⭐
86
Web crawler to download pictures from zhihu.com
Crabler
⭐
85
Web Crawler for Crabs
Bathyscaphe
⭐
83
Fast, highly configurable, cloud native dark web crawler.
Node Search Engine
⭐
79
Sample search engine with web crawler, built on Node.js + CouchDB + Limestone
Goodreadsscraper
⭐
76
Scrape data from Goodreads using Scrapy and Selenium 📚
Davedavefind
⭐
71
A simple search engine based on the web crawler developed in Udacity's CS101 course.
Tspider
⭐
71
Yet Another Web Spider
Webkitcrawler
⭐
69
QtWebKit-based web crawler
Schweizermesser
⭐
66
🎯 Python 3 web crawlers in practice and a data analysis collection | Dangdang | NetEase Cloud Music | Unsplash | Pizza Hut | Maoyan | …
Cvpr2019
⭐
65
Displays all the 2019 CVPR Accepted Papers in a way that they are easy to parse.
Amazon_scraper
⭐
64
Amazon product scraper that uses rotating proxies and headless Chrome from ScrapingAnt
Crawler
⭐
64
I needed a serious web crawler for search engine applications. This is it.
Simplestorm
⭐
62
Simple Storm-like distributed application implementation
Leek
⭐
61
Distributed task redisqueue (the simplest Python framework for distributed function scheduling)
Gocrawler
⭐
60
A distributed web crawler implemented using Go, Postgres, RabbitMQ and Docker
Keyword_based_sina_weibo_crawler
⭐
59
A web crawler for Sina Weibo that searches for and retrieves microblogs containing certain keywords. A simple Python crawler exercise.
Siteshooter
⭐
58
📷 Automate full website screenshots and PDF generation with multiple viewport support.
1-100 of 342 search results
Copyright 2018-2024 Awesome Open Source. All rights reserved.