Awesome Open Source
Search results for crawler
3,834 search results found
Scrapely
⭐
1,668
A pure-python HTML screen-scraping library
Go_spider
⭐
1,629
[Crawler framework (golang)] An awesome Go concurrent crawler (spider) framework. The crawler is flexible and modular. It can easily be extended into a customized crawler, or you can use only the default crawl components.
React Snapshot
⭐
1,619
A zero-configuration static pre-renderer for React apps
Anemone
⭐
1,615
Anemone web-spider framework
Python Crawler
⭐
1,576
Learn how to write Python crawlers systematically, from scratch. Python version 3.6.
Static Site Generator Webpack Plugin
⭐
1,538
Minimal, unopinionated static site generator powered by webpack
Open Source Search Engine
⭐
1,504
Nov 20 2017 -- A distributed open source search engine and spider/crawler written in C/C++ for Linux on Intel/AMD. From gigablast dot com, which has binaries for download. See the README.md file at the very bottom of this page for instructions.
Autocrawler
⭐
1,454
Google, Naver multiprocess image web crawler (Selenium)
Node Rate Limiter
⭐
1,444
A generic rate limiter for node.js. Useful for API clients, web crawling, or other tasks that need to be throttled
Bilix
⭐
1,433
⚡️ Lightning-fast async download tool for bilibili and more
Xsscrapy
⭐
1,398
XSS spider - 66/66 wavsep XSS detected
Diskover Community
⭐
1,391
Diskover Community Edition - Open source file indexer, file search engine and data management and analytics powered by Elasticsearch
Catvodtvspider
⭐
1,365
Lightcrawler
⭐
1,354
Crawl a website and run it through Google lighthouse
Weixin Game Helper
⭐
1,352
A collection of helpers for WeChat mini games (加减大师, 包你懂我, 大家来找茬腾讯版, 头脑王者, 好友画我, 悦动音符, 我最在行, 星…
Swiftlinkpreview
⭐
1,347
It makes a preview from a URL, grabbing all the information such as title, relevant texts and images.
Php Spider
⭐
1,316
A configurable and extensible PHP web spider
Openwpm
⭐
1,307
A web privacy measurement framework
Ast Hook For Js Re
⭐
1,303
A browser memory-roaming solution (still being explored...)
Wombat
⭐
1,297
Lightweight Ruby web crawler/scraper with an elegant DSL which extracts structured data from pages.
Article Extractor
⭐
1,297
Extracts the main article from a given URL with Node.js
Jd Autobuy
⭐
1,292
Python crawler for JD.com (京东): automatic login and online flash purchasing of products
Core
⭐
1,290
The complete web scraping toolkit for PHP.
Sotawhat
⭐
1,280
Returns the latest research results by crawling arXiv papers and summarizing abstracts. Helps you stay afloat with so many new papers every day.
Fscrawler
⭐
1,279
Elasticsearch File System Crawler (FS Crawler)
Catvodtvspider
⭐
1,270
Lxspider
⭐
1,267
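A collection of crawler examples. Includes, but is not limited to: Taobao, JD.com, Tmall, Douban, Douyin, Kuaishou, Weibo, WeChat, Alibaba, Toutiao, pdd, Youku…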
Grab Site
⭐
1,254
The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns
Frontera
⭐
1,244
A scalable frontier for web crawlers
Cariddi
⭐
1,228
Take a list of domains, crawl urls and scan for endpoints, secrets, api keys, file extensions, tokens and more
Beanbun
⭐
1,195
Beanbun is a multi-process web crawler framework written in PHP, with good openness and high extensibility, based on Workerman.
Fdir
⭐
1,191
⚡ The fastest directory crawler & globbing library for NodeJS. Crawls 1m files in < 1s
Lightnovel Crawler
⭐
1,185
Generate and download e-books from online sources.
Scrapy Cluster
⭐
1,137
This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.
Sqliv
⭐
1,111
massive SQL injection vulnerability scanner
Tumblr Crawler
⭐
1,105
Easily download all the photos/videos from specified Tumblr blogs.
Angular Seo
⭐
1,082
SEO for AngularJS apps made easy.
Newpipeextractor
⭐
1,070
NewPipe's core library for extracting data from streaming sites
Parliament Scraper
⭐
1,049
Public Data Scraper for Parliament Data for the EU and other Parliaments
Crawler User Agents
⭐
1,045
Syntactic patterns of HTTP user-agents used by bots / robots / crawlers / scrapers / spiders. pull-request welcome ⭐
Appcrawler
⭐
1,023
An automatic app traversal tool based on Appium
Holiday Cn
⭐
1,018
📅🇨🇳 Chinese statutory holiday data, automatically scraped daily from State Council announcements
Instagram Profilecrawl
⭐
1,001
📝 quickly crawl the information (e.g. followers, tags etc...) of an instagram profile.
Bilili
⭐
999
🍻 bilibili video (including bangumi) and danmaku downloader
Bt Btt
⭐
991
Introduction to the magnet-link site U3C3, plus domain updates
Pxer
⭐
967
A tool for pixiv.net. A pixiv crawler that anyone can use.
Dungeonfs
⭐
966
A FUSE filesystem and dungeon crawling adventure game engine
Python Seo Analyzer
⭐
956
An SEO tool that analyzes the structure of a site, crawls the site, counts words in the body of the site, and warns of any technical SEO issues.
Linkinator
⭐
955
🐿 Scurry around your site and find all those broken links.
Querido Diario
⭐
944
📰 Brazilian government gazettes, accessible to everyone.
Fess
⭐
943
Fess is a very powerful and easily deployable Enterprise Search Server.
Pxer
⭐
937
A tool for pixiv.net. A pixiv crawler that anyone can use.
Mlscraper
⭐
935
🤖 Scrape data from HTML websites automatically by just providing examples
Tumblthree
⭐
922
A Tumblr Blog Backup Application
Instagram Crawler
⭐
922
Get Instagram posts/profile/hashtag data without using Instagram API
Spider
⭐
919
Python website crawler.
Prerender Node
⭐
916
Express middleware for prerendering javascript-rendered pages on the fly for SEO
Crawlergo_x_xray
⭐
915
Combines the 360/0Kee-Team/crawlergo dynamic crawler with the passive scanning feature of the Chaitin (长亭) XRAY scanner
Goclone
⭐
907
Website Cloner - Utilizes powerful Go routines to clone websites to your computer within seconds.
Bhban_rpa
⭐
903
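Example code for the book <6개월 치 업무를 하루 만에 끝내는 업무 자동화> ("Work automation that finishes six months of work in a single day"; 생능출판사, 2020). The examples are aimed at people who have never learned Python, and cover a variety of work-automation areas, from Excel to design, macros, and crawling.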
Xsrfprobe
⭐
897
The Prime Cross Site Request Forgery (CSRF) Audit and Exploitation Toolkit.
Crawler
⭐
897
A high performance web crawler / scraper in Elixir.
Awesome Datahoarding
⭐
892
List of data-hoarding related tools
Scrawler
⭐
882
🏳️‍🌈 Media downloader for any site, including Twitter, Reddit, Instagram, Threads, Facebook, OnlyFans, YouTube, Pinterest, PornHub, XHamster, XVIDEOS, ThisVid etc.
Kimuraframework
⭐
874
Kimurai is a modern web scraping framework written in Ruby. It works out of the box with Headless Chromium/Firefox, PhantomJS, or simple HTTP requests, and lets you scrape and interact with JavaScript-rendered websites.
Baiduspider
⭐
872
BaiduSpider, a crawler that scrapes Baidu search results. It currently supports Baidu web search, Baidu image search, and Baidu Zhidao search.
Angrysearch
⭐
866
Linux file search, instant results as you type
Mzitu
⭐
853
👧 Crawler for photo sets of beautiful women (part 2)
Zhihu Crawler
⭐
843
zhihu-crawler is Java-based and high-performance, supports a free HTTP proxy pool and horizontal scaling, and is distributed…
Scrapy Selenium
⭐
842
Scrapy middleware to handle javascript pages using selenium
Storm Crawler
⭐
834
A scalable, mature and versatile web crawler based on Apache Storm
Scrapyrt
⭐
793
HTTP API for Scrapy spiders
Icrawler
⭐
792
A multi-thread crawler framework with many builtin image crawlers provided.
Crawly
⭐
790
Crawly, a high-level web crawling & scraping framework for Elixir.
Ipfs Search
⭐
779
Search engine for the Interplanetary Filesystem.
Seccrawler
⭐
777
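A crawler and push-notification program that makes it easy for security researchers to get daily security digests. It currently crawls 先知社区, 安全客, Seebug Paper, 跳跳糖, 奇安信攻防社区, and 棱角社区, as well as lab blogs such as 绿盟, 腾讯玄武, 天融信, and 360. Continuously updated.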
Spidr
⭐
775
A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
Baiduimagespider
⭐
774
A super lightweight Baidu image crawler
Till
⭐
770
DataHen Till is a companion tool to your existing web scraper that instantly makes it scalable, maintainable, and more unblockable, with minimal code changes on your scraper. Integrates with any scraper in 5 minutes.
Computerstudent
⭐
764
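Systematic study materials for computer science majors (Python, C, C++, computer organization, computer networks, compiler principles, circuits, Google plugins…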
Creeper
⭐
762
🐾 Creeper - The Next Generation Crawler Framework (Go)
Nginx Badbot Blocker
⭐
759
Block bad, possibly even malicious web crawlers (automated bots) using Nginx
Fetchbot
⭐
758
A simple and flexible web crawler that follows the robots.txt policies and crawl delays.
Spider_collection
⭐
754
Python crawlers. Currently in stock: NetEase Cloud Music song crawling, Bilibili video crawling, Zhihu Q&A crawling, wallpaper crawling, xvideos…
Lulu
⭐
752
[Unmaintained] A simple and clean video/music/image downloader 👾
X Crawl
⭐
718
x-crawl is a flexible, multifunctional Node.js crawler library. Its flexible usage and numerous features help you crawl pages, interfaces, and files quickly, safely, and stably.
Skrape.it
⭐
714
A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.
Device_detector
⭐
711
DeviceDetector is a precise and fast user agent parser and device detector written in Ruby
Packtpub Crawler
⭐
701
Download your daily free Packt Publishing eBook https://www.packtpub.com/packt/offers/free-learnin
Bookcorpus
⭐
698
Crawl BookCorpus
Tweetscraper
⭐
698
TweetScraper is a simple crawler/spider for Twitter Search without using API
Listed Company News Crawl And Text Analysis
⭐
689
Crawls historical news text data on listed companies (individual stocks) from Sina Finance, 每经网, 金融界, 中国证券网, and 证券时报网, for text…
Pyptt
⭐
685
A PTT library that logs in over a direct connection; supports PTT and PTT2
Catgate
⭐
681
CatGate is a small crawler framework built as a Chrome extension. Because it runs as a browser extension, it needs no simulated login and can mimic real user behavior as faithfully as possible.
Go Dork
⭐
677
The fastest dork scanner written in Go.
Staticgen
⭐
668
Static website generator that lets you use HTTP servers and frameworks you already know
Domain_hunter
⭐
658
A Burp Suite extension that automatically tries to find all subdomains, similar domains, and related domains of an organization, based on traffic.
One Python
⭐
655
We don't need a lot of libraries. We just need the best ones. | Unofficial recommended first choice.
Word2vec Graph
⭐
650
Exploring word2vec embeddings as a graph of nearest neighbors
Xxl Crawler
⭐
650
A distributed web crawler framework (XXL-CRAWLER).
101-200 of 3,834 search results