Awesome Open Source
Search results for java crawler
186 search results found
Webmagic
⭐
11,080
A scalable web crawler framework for Java.
Spider Flow
⭐
8,075
A next-generation crawler platform that defines crawl flows graphically, so you can build a crawler without writing code.
Crawler4j
⭐
4,192
Open Source Web Crawler for Java
Novel Plus
⭐
3,358
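novel-plus is a multi-platform (PC, WAP), full-featured novel CMS. It includes novel recommendations, search, rankings, reading, bookshelves, comments, a novel crawler, a member center, an author area, …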
Webcollector
⭐
2,974
WebCollector is an open source web crawler framework based on Java. It provides simple interfaces for crawling the Web; you can set up a multi-threaded web crawler in less than 5 minutes.
Nutch
⭐
2,742
Apache Nutch is an extensible and scalable web crawler
Heritrix3
⭐
2,579
Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
Gecco
⭐
2,403
Easy-to-use, lightweight web crawler
Spring Boot Quick
⭐
2,282
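🌿 Quick-start learning examples based on Spring Boot, integrating open source frameworks the author has come across, such as RabbitMQ (delay queues), K…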
Seimicrawler
⭐
1,895
A simple, agile, distributed Java crawler framework with Spring Boot support.
Fscrawler
⭐
1,279
Elasticsearch File System Crawler (FS Crawler)
Catvodtvspider
⭐
1,270
Newpipeextractor
⭐
1,070
NewPipe's core library for extracting data from streaming sites
Fess
⭐
943
Fess is very powerful and easily deployable Enterprise Search Server.
Zhihu Crawler
⭐
843
zhihu-crawler is a Java-based, high-performance, distributed crawler with support for a free HTTP proxy pool and horizontal scaling…
Storm Crawler
⭐
834
A scalable, mature and versatile web crawler based on Apache Storm
Computerstudent
⭐
764
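Systematic study materials for computer science students (Python, C, C++, computer organization, computer networks, compiler principles, circuits, Google plugins…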
Domain_hunter
⭐
658
A Burp Suite extension that uses observed traffic to automatically find all subdomains, similar domains, and related domains of an organization.
Xxl Crawler
⭐
650
A distributed web crawler framework (XXL-CRAWLER).
Yacy_grid_crawler
⭐
639
Crawler Microservice for the YaCy Grid
Netdiscovery
⭐
557
NetDiscovery is a general-purpose crawler framework/middleware built on Vert.x, RxJava 2, and other frameworks.
Jvppeteer
⭐
549
Headless Chrome for Java (Java crawler)
Google Play Crawler
⭐
539
Play with Google Play API :)
Go_jobs
⭐
536
A look at the current market for Golang.
Crawljax
⭐
493
Crawljax
Lxbook
⭐
426
Code repository for the book 《爬虫逆向进阶实战》 (Advanced Crawler Reverse Engineering in Practice).
Opensearchserver
⭐
419
Open-source Enterprise Grade Search Engine Software
Sparkler
⭐
401
Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
Crawlerforreader
⭐
293
A local web novel crawler for Android, based on jsoup and XPath.
Weibopicdownloader
⭐
248
A crawler that downloads Weibo images without logging in.
Elasticsearch River Web
⭐
232
Web Crawler for Elasticsearch
News Crawl
⭐
229
News crawling with StormCrawler - stores content as WARC
Crawler Commons
⭐
217
A set of reusable Java components that implement functionality common to any web crawler
Commoncrawl Crawler
⭐
208
The Common Crawl Crawler Engine and Related MapReduce code (2008-2012)
One
⭐
201
Uses MVP, Dagger2, and Realm as the project's main infrastructure, with data crawled from One and Sujin.
Webvideobot
⭐
200
Web crawler.
Web Bee
⭐
186
🐝 Web vertical crawler framework for fun
Awesome Java Crawler
⭐
172
This repository collects and organizes crawler-related resources, primarily for Java.
Fxdesktopsearch
⭐
168
A JavaFX based desktop search application.
Collector Http
⭐
162
Norconex Web Crawler (or spider) is a flexible web crawler for collecting, parsing, and manipulating data from the Internet (or Intranet) to various data repositories such as search engines.
Jlitespider
⭐
151
A lite distributed Java spider framework :-)
Akka Crawler Example
⭐
124
Some example code of using Akka from Java
Nutch Htmlunit
⭐
122
A plugin that extends Apache Nutch with HtmlUnit to crawl and parse AJAX pages.
Prerender Java
⭐
120
Java framework for prerender
Multithreading Crawlers
⭐
110
A multithreaded crawler that collects Taobao product detail page URLs.
Ghs
⭐
106
GitHub Search: Platform used to crawl, store and present projects from GitHub, as well as any statistics related to them.
Crawlerpack
⭐
99
Java web data crawler package
Proxy Pool
⭐
87
A proxy IP pool service for crawlers; other crawler programs can fetch proxies through a REST API.
Cc Index Table
⭐
78
Index Common Crawl archives in tabular format
Movie Elasticsearch
⭐
76
An open source movie search engine built with Spring Boot 2.0 and Elasticsearch.
Common Crawl
⭐
73
playing around with the common crawl dataset
Venom
⭐
73
Your preferred open source focused crawler for the deep web.
Dodder
⭐
71
A distributed DHT crawler that sniffs torrents from BitTorrent network
Customer Review Crawler
⭐
69
A crawler to collect reviews and product information on Amazon.com
Bubing
⭐
68
The LAW next generation crawler.
Hot Crawler
⭐
66
A springboot-based hot news crawler.
Crawler
⭐
64
Simple java web crawler
Slime
⭐
63
🍰 A visual crawler management platform
Jewelcrawler
⭐
62
Douban movie crawler: crawls movie details and short comments, saves them to a MySQL database, and includes sentiment analysis based on the comments.
Woothee Java
⭐
56
Woothee Java implementation and Hive UDF
Piccrawler
⭐
53
An image crawler built with RxJava 2 and Java 8 features.
Baidu Chain Dog
⭐
52
A crawler for Baidu's 莱茨狗 (Laici Dog).
Lezhin Comics Downloader
⭐
48
📥 Downloader for lezhin comics
Jscrawler
⭐
48
A library for dynamically updating crawler scripts in Android apps
Cc Warc Examples
⭐
46
CommonCrawl WARC/WET/WAT examples and processing code for Java + Hadoop
Twitlogic
⭐
45
Real-time #SemanticWeb in <= 140 chars
Smartmovingreloaded
⭐
45
A Minecraft mod that provides additional movement actions such as crawling and climbing.
Camus
⭐
44
experimental project for crawling articles from a user's twitter feed and re-arranging them in terms of readability attributes
Cc Webgraph
⭐
44
Tools to construct and process webgraphs from Common Crawl data
Example Warc Java
⭐
43
Wpcrawler
⭐
43
A web crawler for a single WordPress site
Avro Schema Generator
⭐
43
Library for generating avro schema files (.avsc) based on DB tables structure
Quick Crawler
⭐
43
Java crawler framework
Burp Dom Scanner
⭐
42
Burp Suite's extension to scan and crawl Single Page Applications
Areweprivateyet
⭐
40
The crawler/analysis component of Are We Private Yet?
Wikireverse
⭐
39
Hadoop jobs for WikiReverse project. Parses Common Crawl data for links to Wikipedia articles.
Facebookcrawler
⭐
39
Facebook Crawler - Crawl information from facebook
Vscrawler
⭐
39
A crawler framework suited to grabbing data
Xultimate Toolkit
⭐
38
A JavaEE application reference architecture based on the Spring Framework.
Neteasecloudmusiccrawler
⭐
38
HttpClient + Jsoup + Queue
Java Carwler Technology
⭐
36
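Web Data Collection Technology: Java Web Crawlers (complete code for the book manuscript, covering the various techniques and concepts of web crawling)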
Cetty
⭐
36
A crawler framework based on event dispatching
Nintendoswitcheshophelper
⭐
36
DEPRECATED! Please move to https://github.com/Dounx/gwitch
Mmdownloader
⭐
36
New project for the 마루마루 (Marumaru) downloader
Zeekeye
⭐
36
A Fast and Powerful Scraping and Web Crawling Framework.
Flink Crawler
⭐
35
Continuous scalable web crawler built on top of Flink and crawler-commons
Url Frontier
⭐
34
API definition, resources and reference implementation of URL Frontiers
Dungeon Crawl Android
⭐
33
Dungeon Crawl: Stone Soup for Android (console version)
Burp Csj
⭐
32
BurpCSJ extension for Burp Pro - Crawljax Selenium JUnit integration
Docdex
⭐
32
JSON API & Discord Bot for Javadocs
Crawler4j
⭐
30
Open Source Simple Web Crawler for Java. Simple Flexible And Lightweight
Ldspider
⭐
30
A crawler for the Linked Data web
Stormscraper
⭐
29
A Storm based web crawler with Cassandra backend
Toutiaocrawler
⭐
29
Example crawler for Toutiao accounts (头条号)
Visual Spider
⭐
29
A graphical web crawler built with JavaFX on top of crawler4j
Wx Crawl
⭐
28
A crawler for WeChat official account articles
Search_ads_web_service
⭐
27
Online search advertisement platform & Realtime Campaign Monitoring [Maybe Deprecated]
Nutch Selenium
⭐
27
Webcrawler Verifier
⭐
27
Java library providing functionality to verify that user-agents are who they claim to be.
Crawler Denfender
⭐
26
An anti-web-crawler system
1-100 of 186 search results