Awesome Open Source

Search results for redis crawler

84 search results found

Proxy_pool ⭐ 19,442

Python ProxyPool for web spider

Pyspider ⭐ 15,943

A Powerful Spider(Web Crawler) System in Python.

Crawlab ⭐ 10,521

Distributed web crawler admin platform for managing spiders regardless of language or framework.

Scrapy Redis ⭐ 5,438

Redis-based components for Scrapy.

Haipproxy ⭐ 5,384

💖 Highly available distributed IP proxy pool, powered by Scrapy and Redis

Proxypool ⭐ 5,154

An Efficient ProxyPool with Getter, Tester and Server

Distribute_crawler ⭐ 3,176

A distributed web crawler built with Scrapy, Redis, MongoDB, and Graphite; the underlying storage is a MongoDB cluster, and distribution uses re

Anemone ⭐ 1,615

Anemone web-spider framework

Scrapy Cluster ⭐ 1,137

This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.

Listed Company News Crawl And Text Analysis ⭐ 689

Crawls historical news text for listed companies (individual stocks) from Sina Finance, 每经网, 金融界, 中国证券网, and 证券时报网, for text

Magnet Dht ⭐ 591

✌️ Python3 BitTorrent DHT crawler

Netdiscovery ⭐ 557

NetDiscovery is a general-purpose crawler framework/middleware built on Vert.x, RxJava 2, and other frameworks.

Go_jobs ⭐ 536

A look at the job market for Golang

Go Movies ⭐ 232

Golang spider/crawler for movies

Awesome_crawl ⭐ 206

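Crawlers for Tencent News, Zhihu topics, Weibo followers, Tumblr, Douyu danmaku (bullet comments), and Meizitu, plus distributed design and more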

Zhihu Crawler People ⭐ 179

A simple distributed crawler for zhihu && data analysis

🥄 A package for building specific Proxy Pool for different Sites.

Weibosearch ⭐ 144

A distributed Sina Weibo Search spider based on Scrapy and Redis.

POOPAK - TOR Hidden Service Crawler

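Source code for 《数据采集从入门到放弃》 ("Data Collection: From Getting Started to Giving Up"). Contents: introduction to crawlers, job prospects, and crawler engineer interview questions; introduction to the HTTP protocol; using Requests; introduction to the XPath parser; MongoDB and MySQL; multithreaded crawlers; introduction to Scrapy; introduction to Scrapy-redis; deploying with Docker; managing Docker clusters with Nomad; querying Docker logs with EFK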

Zhihu_crawler ⭐ 100

a crawler for zhihu

Polipus: distributed and scalable web-crawler framework

Scrapyd Cluster On Heroku ⭐ 90

Set up a free and scalable Scrapyd cluster for distributed web crawling with just a few clicks.

Baidu_tieba_crawler ⭐ 90

Tech stack: vue, cheerio, socket, mongodb, redis

Lcbo Api ⭐ 85

A crawler and API server for Liquor Control Board of Ontario retail data

Zhihu_spider ⭐ 79

large-scale user information crawler of zhihu

A DHT crawler and torrent indexer

Zhihu Scrapy ⭐ 79

A scrapy zhihu crawler

Awesome Python Primer ⭐ 78

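An index of high-quality Chinese-language resources for self-taught Python beginners, including books, documentation, and videos, suited to crawling, web, data analysis, and machine learning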

Tornado Web Crawler

Collyzar ⭐ 65

Distributed redis-based web crawler framework for colly

Cross-platform persistent and distributed web crawler 🦀

Fishfishjump ⭐ 57

Fish Fish Jump is a simple, basic search engine solution in Python. 🐟 🐟 🐟

Bing Wallpaper Action ⭐ 55

API with Redis / Vercel, database with JSON, crawling with GitHub Actions. Product: https://github.com/zkeq/Bing-Wallpaper-Action/tree

Zhihu Crawler ⭐ 52

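A from-scratch implementation that crawls Zhihu on a schedule, mines the results for valuable information, and visualizes the crawled data on a web page. The project is currently under development; you are welcome to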

Ugly Distributed Crawler ⭐ 43

An extremely simple distributed crawler built on Redis

Novel Online ⭐ 40

Online novel transcoding tool: an ad-free, minimalist reading experience.

Scrapy Distributed ⭐ 40

A series of distributed components for Scrapy. Including RabbitMQ-based components, Kafka-based components, and RedisBloom-based components for Scrapy.

Rust Web Crawler saving pages on Redis

Slinky, a high-performance web crawler / text analytics in Python, Redis, Hadoop, R, Gephi

Go Crawler Distributed ⭐ 39

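A distributed crawler project that supports custom page parsers for secondary development. The project uses a microservice architecture overall and implements messaging through a message queue (listed libraries: gorm, goquery, easyjson, viper, amqp, zap, go-micro), is deployed in containers with Docker, and supports horizontal scaling of the intermediate crawler nodes.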

🔥 Golang basics and hands-on practice (including: crawler, distributed systems, data analysis, redis, etcd, raft, crontab tasks)

crawler data weibo & baidu & zhihu & newsmth & tianya & v2ex

Scrapy Kafka Redis ⭐ 35

Distributed crawling/scraping, Kafka And Redis based components for Scrapy

Laravel Block Bots ⭐ 34

Block crawlers and high traffic users on your site by IP using Redis

crawl pages to check what is for lunch today

Nodejs Examples ⭐ 33

Example Node.js projects

Chromium / Puppeteer site crawler

Generates an object for testing whether a request is sent; "request" here is Mikeal's request module.

Yurun Crawler ⭐ 28

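Yurun Crawler (宇润爬虫框架) is a low-code, high-performance, distributed crawling framework, developed on the imi framework and running in Swoole's memory-resident coroutine environment.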

Universityrecruitment Ssurvey ⭐ 28

Using serious data to answer the question: what kinds of companies recruit at what kinds of universities?

Node_tumblr_spider ⭐ 26

A crawler written by Node.js to download tumblr videos.

Scrapy_pro ⭐ 24

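Scrapy crawler development for 5,000+ sites, covering some technical architecture setup and various ways of handling anti-crawling measures; see the README for details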

G Crawl Py ⭐ 24

Gevent Crawling in Python, with Utilities

Crawl_pornhub ⭐ 23

crawling pornhub

Gecco Redis ⭐ 23

Gecco crawler with distributed support via Redis

Deadpool ⭐ 22

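A crawler application that uses Celery as its main framework; crawler tasks can be added flexibly, and crawlers for multiple sites can run at the same time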

Bthello App ⭐ 21

Python3 DHT magnet/torrent crawler with torrent parsing and torrent search

Movierater ⭐ 20

A useful website for finding movie ratings in Chinese and English by crawling Yahoo, PTT, and IMDb.

A simple, scalable, and highly efficient web crawler framework for Java.

Pubcrawl ⭐ 18

*Deprecated* A short and sweet Python web crawler using Redis as the process queue, seen set and Memcache style rate limiter for robots.txt

The world's leading data crawler platform!

Proxypool ⭐ 15

A python async proxy crawler and proxy pool.

A study of the scrapy-redis code

Arachnod ⭐ 14

High performance crawler for Nodejs

Python3 DHT magnet/torrent crawler with torrent parsing and torrent search

Robots.txt ⭐ 13

🤖 robots.txt as a service. Crawls robots.txt files, downloads and parses them to check rules through an API

Springbootdc ⭐ 13

SpringBoot Developer Components

Pubsenti Finder ⭐ 12

Sentiment analysis of Weibo comments, crawling, text classification, and web.

Ciao Ssr ⭐ 12

A server side render service based on puppeteer

Crawler Framework ⭐ 12

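A distributed crawler framework that simulates user requests with WebDriver, passes messages via Kafka, and uses HBase for distributed web page storage, with an IP service, a number verification service, and more; proxy pages are accessed through the H5 and "we" versions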

Soundclouder ⭐ 11

🎹 Crawling SoundCloud at scale (Go + Redis)

Scrapy Blog Crawler ⭐ 10

Crawl a blog url, and find all url from it, then save to mysql.

Crawling_the_web ⭐ 10

Companion source code for the book 《虫术:Python绝技》

Proxypool ⭐ 9

A ProxyPool based on Scrapy and Redis

Function flow, data flow, sync and async, event-driven, channels in multiple modes

Magnet Crawler ⭐ 8

A crawler for magnet links.

Airspider ⭐ 8

A Fast and Light Python Spider Framework 🕷️

Creepy Crawly ⭐ 8

A website crawler for node written in coffee-script

Dockerized Image Crawler ⭐ 8

An example image crawler app using fig+Docker, Twisted, Redis, and ZeroMQ networked together

Githubcrawler ⭐ 8

Distributed GitHub crawler

Python_scrapy_bank ⭐ 7

Interbank numbers (联行号) for banks nationwide

Crawler_weibo ⭐ 7

Python scraper for Weibo post data from the Sina Weibo mobile site (m站)

Anemone_lite ⭐ 7

Distributed web crawler using mongodb

Xiaohongshu Spider Visualizer ⭐ 7

A distributed web crawler for xiaohongshu.com and visualization for the crawled content.

Python Lcv Search Engine ⭐ 7

Updated version of a Python distributed crawler: a search engine. It uses the Google Chrome web browser as its principal user interface.

NetEase Cloud Music crawler for building meaningful music charts!

Knowledge Graph Demo ⭐ 6

A small knowledge graph demo

Just a typical search engine in this universe 🔥🔥🔥

Scrapy Redis ⭐ 6

compatible with scrapy 0.17

Progressive Web App Crawler

Bilibilispider ⭐ 6

Crawls video information from the bilibili danmaku site and does some analysis.

Hotel Spider ⭐ 6

Simple distributed crawler system for XC hotel data

A data collection system based on Django and Scrapy

Skeleton X ⭐ 6

🎉 An SSM scaffold based on Spring Boot; currently integrates spring-security, websocke

Tiny Garbage3 ⭐ 5

A tiny FTP crawler and indexer with a web UI, based on Redis and designed for performance

Spidertaobao ⭐ 5

spider taobao suggest

Eastmoney ⭐ 5

a crawler by scrapy of eastmoney.com

Scrapy Cluster ⭐ 5

Modified from http://github.com/istresearch/scrapy-cluster.git to suit the project.
