Awesome Open Source

Search results for crawler spider

424 search results found

Colly ⭐ 21,902

Elegant Scraper and Crawler Framework for Golang

Easyspider ⭐ 20,149

A visual no-code web crawler/spider. EasySpider: a visual tool for browser automation testing, data collection, and crawling, with no-code graphical…

Proxy_pool ⭐ 19,442

Python proxy pool for web spiders

Pyspider ⭐ 15,943

A Powerful Spider(Web Crawler) System in Python.

Examples Of Web Crawlers ⭐ 13,142

Interesting examples of Python crawlers that are friendly to beginners, mainly scraping Taobao, Tmall, WeChat, WeChat Read, Douban, QQ, and similar sites.

Crawlab ⭐ 10,521

Distributed web crawler admin platform for managing spiders regardless of language or framework.

Photon ⭐ 10,244

Incredibly fast crawler designed for OSINT.

Avbook ⭐ 8,777

Adult video (AV) management system with crawlers for avmoo, javbus, and javlibrary; an online Japanese adult video library and magnet link database.

Spider Flow ⭐ 8,075

A next-generation crawler platform that defines crawler flows graphically, so crawlers can be built without writing code.

Infospider ⭐ 6,856

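INFO-SPIDER is a crawler toolbox 🧰 that brings many data sources together, designed to help users retrieve their own data safely and quickly. The tool's code is open source, and the process is…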

Node Crawler ⭐ 6,610

Web Crawler/Spider for NodeJS + server-side jQuery ;-)

Awesome Web Scraping ⭐ 6,060

List of libraries, tools and APIs for web scraping and data processing.

Awesome Crawler ⭐ 5,859

A collection of awesome web crawlers and spiders in different languages

Haipproxy ⭐ 5,384

💖 Highly available distributed IP proxy pool, powered by Scrapy and Redis

Douyin_tiktok_download_api ⭐ 4,844

🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous Douyin, Kuaishou, T…

Ecommercecrawlers ⭐ 3,724

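Hands-on 🐍 crawlers 🕷 for a variety of websites and e-commerce data. Includes 🕸: Taobao products, WeChat official accounts, Dianping, Qichacha, job sites, Xianyu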

Toapi ⭐ 3,417

Every web site provides APIs.

Novel Plus ⭐ 3,358

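novel-plus is a full-featured novel CMS for reading on multiple clients (PC, WAP). It includes novel recommendations, search, rankings, reading, bookshelves, comments, a novel crawler, a member center, an author zone, …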

Browser Fingerprinting ⭐ 3,353

Analysis of bot protection systems with available countermeasures 🚿. How to defeat anti-bot systems 👻 and get around browser fingerprinting scripts 🕵️‍♂️ when scraping the web?

Distribute_crawler ⭐ 3,176

A distributed web crawler built with Scrapy, Redis, MongoDB, and Graphite. The underlying storage is a MongoDB cluster, and distribution uses re…

Toapi ⭐ 3,153

Every web site provides APIs.

Gerapy ⭐ 3,144

Distributed Crawler Management Framework Based on Scrapy, Scrapyd, Django and Vue.js

Querylist ⭐ 2,598

🕷️ The progressive PHP crawler framework! An elegant, progressive PHP scraping framework.

Python3 Spider ⭐ 2,582

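Python crawlers in practice: simulated login for major websites, including but not limited to slider verification, Pinduoduo, Meituan, Baidu, bilibili, Dianping, and Taobao. If you like it, please star ❤️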

Lianjia Beike Spider ⭐ 2,464

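Housing price crawler for Lianjia and Beike, collecting housing price data for 21 major Chinese cities including Beijing, Shanghai, Guangzhou, and Shenzhen (residential communities, second-hand homes, rentals, new … MongoDB, Excel, and JSON storage; supports Python 2 and 3; charts the data; richly commented; stars appreciated. For learning and reference only; do not use for commercial purposes, and use at your own risk.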

BitTorrent DHT Protocol && DHT Spider.

Decryptlogin ⭐ 2,375

DecryptLogin: APIs for logging in to some websites using requests.

Owllook ⭐ 2,340

owllook: a novel search engine

Torbot ⭐ 2,338

Dark Web OSINT Tool

Feapder ⭐ 2,333

🚀🚀🚀 feapder is an easy-to-use, powerful Python crawler framework. Built in: AirSpider, Spider, …

Web Scraping Framework

Gospider ⭐ 2,190

Gospider - Fast web spider written in Go

Web crawling framework based on asyncio.

Cross Platform C# web crawler framework built for speed and flexibility. Please star this project! +1.

Web crawling framework based on asyncio.

Geziyor ⭐ 1,892

Geziyor, blazing fast web crawling & scraping framework for Go. Supports JS rendering.

Crawler Detect ⭐ 1,842

🕷 CrawlerDetect is a PHP class for detecting bots/crawlers/spiders via the user agent

Async Python 3.6+ web scraping micro-framework based on asyncio

Pspider ⭐ 1,675

A simple, easy-to-use Python crawler framework. QQ discussion group: 597510560

Go_spider ⭐ 1,629

[Crawler framework (Golang)] An awesome Go concurrent crawler (spider) framework. The crawler is flexible and modular. It can easily be extended into a customized crawler, or you can use only the default crawl components.

Anemone ⭐ 1,615

Anemone web-spider framework

Open Source Search Engine ⭐ 1,504

Nov 20 2017 -- A distributed open source search engine and spider/crawler written in C/C++ for Linux on Intel/AMD. From gigablast dot com, which has binaries for download. See the README.md file at the very bottom of this page for instructions.

Xsscrapy ⭐ 1,398

XSS spider - 66/66 wavsep XSS detected

Catvodtvspider ⭐ 1,365

Php Spider ⭐ 1,316

A configurable and extensible PHP web spider

Catvodtvspider ⭐ 1,270

Grab Site ⭐ 1,254

The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns

Beanbun ⭐ 1,195

Beanbun is a multi-process web crawler framework written in PHP, with good openness and high extensibility, based on Workerman.

Scrapy Cluster ⭐ 1,137

This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.

Crawler User Agents ⭐ 1,045

Syntactic patterns of HTTP user-agents used by bots / robots / crawlers / scrapers / spiders. Pull requests welcome ⭐

🍻 bilibili video (including bangumi) and danmaku downloader

Introduction to the magnet-link site U3C3, with domain updates

Querido Diario ⭐ 944

📰 Brazilian government gazettes, accessible to everyone.

Python website crawler.

Xsrfprobe ⭐ 897

The Prime Cross Site Request Forgery (CSRF) Audit and Exploitation Toolkit.

Crawler ⭐ 897

A high performance web crawler / scraper in Elixir.

Kimuraframework ⭐ 874

Kimurai is a modern web scraping framework written in Ruby that works out of the box with Headless Chromium/Firefox, PhantomJS, or simple HTTP requests and lets you scrape and interact with JavaScript-rendered websites

Baiduspider ⭐ 872

BaiduSpider: a crawler for Baidu search results, currently supporting Baidu web search, Baidu image search, and Baidu Zhidao search

Zhihu Crawler ⭐ 843

zhihu-crawler is Java-based and high-performance, supports a free HTTP proxy pool and horizontal scaling, and is a distributed crawl…

Scrapyrt ⭐ 793

HTTP API for Scrapy spiders

Icrawler ⭐ 792

A multithreaded crawler framework with many built-in image crawlers.

Crawly, a high-level web crawling & scraping framework for Elixir.

A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.

Baiduimagespider ⭐ 774

A super lightweight Baidu image crawler

Creeper ⭐ 762

🐾 Creeper - The Next Generation Crawler Framework (Go)

Spider_collection ⭐ 754

Python crawlers. Currently in stock: NetEase Cloud Music song scraping, Bilibili video scraping, Zhihu Q&A scraping, wallpaper scraping, xvideos

X Crawl ⭐ 718

x-crawl is a flexible, multifunctional Node.js crawler library. Its flexible usage and many features help you crawl pages, interfaces, and files quickly, safely, and stably.

Device_detector ⭐ 711

DeviceDetector is a precise and fast user agent parser and device detector written in Ruby

Tweetscraper ⭐ 698

TweetScraper is a simple crawler/spider for Twitter Search without using API

Domain_hunter ⭐ 658

A Burp Suite extension that automatically finds the subdomains, similar domains, and related domains of an organization, based on observed traffic.

Xxl Crawler ⭐ 650

A distributed web crawler framework (XXL-CRAWLER).

Hacker News Digest ⭐ 620

📰 Let ChatGPT Summarize Hacker News for You

Fictiondown ⭐ 601

Novel download | Novel scraping | Qidian | Biquge | Export to Markdown | Export to txt | Convert to EPUB | Ad filtering | Automatic proofreading

Newcrawler ⭐ 583

Free Web Scraping Tool with Java

Python Fxxk Spider ⭐ 571

A collection of free Python crawler projects

OSINT Swiss Army Knife

Netdiscovery ⭐ 557

NetDiscovery is a general-purpose crawler framework/middleware built on Vert.x, RxJava 2, and other frameworks.

DouYin API for humans, used to crawl popular videos and music

Go_jobs ⭐ 536

A look at market conditions for Golang

Spidermon ⭐ 486

Scrapy extension for monitoring spider execution.

Webster ⭐ 465

A reliable, high-level web crawling & scraping framework for Node.js.

Tarantula ⭐ 453

a big hairy fuzzy spider that crawls your site, wreaking havoc

Awesome Scrapy ⭐ 450

A curated list of awesome packages, articles, and other cool resources from the Scrapy community.

Spidersuite ⭐ 447

Advanced web spider/crawler for cyber security professionals

Crack Js Spider ⭐ 442

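JS cracking and reverse engineering: cracking the encrypted parameters of JS anti-crawler protections. Already cracked: Geetest slider w (2022.2.19), QQ Music sign (20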

Learnpython ⭐ 437

Basic Python practice code and assorted crawler code

Code repository for the book 《爬虫逆向进阶实战》 (Advanced Crawler Reverse Engineering in Practice)

The fastest web crawler written in Rust. Maintained by @a11ywatch.

Html2article ⭐ 425

Main-content extraction from HTML web pages

Fbcrawl ⭐ 415

A Facebook crawler

Linkedin Profile Scraper Api ⭐ 404

🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON.

Signature_algorithm ⭐ 387

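Request signing or encryption algorithms for various apps, mini programs, and websites. Currently covers: Ziroom, Xiaohongshu, Danke Apartment, Luckin Coffee, Bangkok Airways (bangkokair)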

Scrapybook ⭐ 378

Scrapy Book Code

Ants Go ⭐ 368

Open source, distributed, RESTful crawler engine in Golang

Gospider ⭐ 354

A crawler framework implemented in Golang. Users only need to care about page rules, and a web management interface is provided. Built on colly.

Zhihu Login ⭐ 350

Simulated Zhihu login, with support for extracting CAPTCHAs and saving cookies

91porn Api ⭐ 341

🌭💦 Unrestricted online API for a 91porn crawler (permanently valid, passphrase updated daily), plus an online web preview

Linkedindumper ⭐ 337

Python 3 script to dump/scrape/extract company employees from LinkedIn API

Free_proxy_website ⭐ 333

A collection of websites for obtaining free SOCKS/HTTPS/HTTP proxies

Freshonions Torscraper ⭐ 313

Fresh Onions is an open source TOR spider / hidden service onion crawler hosted at zlal32teyptf4tvi.onion

Related Searches

Python Crawler (4,528)

Python Spider (2,155)

Javascript Crawler (1,142)

Spider Scrapy (982)

Scraper Crawler (896)

Java Crawler (593)

Crawler Scrapy (578)

Golang Crawler (509)

1-100 of 424 search results
