Awesome Open Source

Search results for mongodb crawler

70 search results found

Crawlab ⭐ 10,521

Distributed web crawler admin platform for managing spiders, regardless of language or framework.

Distribute_crawler ⭐ 3,176

A distributed web crawler built with Scrapy, Redis, MongoDB, and Graphite; the underlying storage is a MongoDB cluster, and distribution uses re…

Lianjia Beike Spider ⭐ 2,464

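Housing price crawler for Lianjia and Beike that collects housing price data for 21 major Chinese cities, including Beijing, Shanghai, Guangzhou, and Shenzhen (residential communities, second-hand homes, rentals, new… MongoDB, Excel, and JSON storage; supports Python 2 and 3; charts for displaying the data; richly commented. Stars are appreciated. For learning and reference only; do not use for commercial purposes, at your own risk.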

Anemone ⭐ 1,615

Anemone web-spider framework

Weixin Game Helper ⭐ 1,352

A collection of helpers for WeChat mini games (加减大师, 包你懂我, 大家来找茬腾讯版, 头脑王者, 好友画我, 悦动音符, 我最在行, 星…

Zhihu Crawler ⭐ 843

zhihu-crawler is a high-performance, Java-based project that supports a free HTTP proxy pool and horizontal scaling; a distributed craw…

Ppspider ⭐ 278

Web spider framework built on Puppeteer. Supports flexible task queues and task scheduling via decorators, NeDB / MongoDB storage, and data visualization.

Taobao_bra_crawler ⭐ 189

A Taobao web crawler, just for fun.

Zhihu Crawler People ⭐ 179

A simple distributed crawler for Zhihu, plus data analysis

Github_commit_crawler ⭐ 167

Tool used to continuously monitor a Github org for mistaken public commits

Scrapy_demo ⭐ 150

all kinds of scrapy demo

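Source code for 《数据采集从入门到放弃》 ("Data Collection: From Getting Started to Giving Up"). Contents: introduction to crawlers, the job market, and interview questions for crawler engineers; introduction to the HTTP protocol; using Requests; introduction to the XPath parser; MongoDB and MySQL; multithreaded crawlers; introduction to Scrapy; introduction to Scrapy-redis; deploying with Docker; managing a Docker cluster with Nomad; querying Docker logs with EFK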

Zhihu_crawler ⭐ 100

a crawler for zhihu

A distributed DHT crawler that sniffs torrents from BitTorrent network

Tornado Web Crawler

Douyin_spider ⭐ 59

🎨 A simple, easy-to-use crawler for Douyin that can download the videos, audio, and data of specified users, challenges, and music.

Theta Infrastructure Ledger Explorer ⭐ 57

Explorer for the Theta Ledger

Fishfishjump ⭐ 57

Fish Fish Jump is a simple, basic search engine solution written in Python. 🐟 🐟 🐟

Dytt Reptitle ⭐ 53

🐜 Dytt crawler

Amazon S3 bucket finder and crawler.

Daily Code ⭐ 53

Everyday code: crawlers, small GUI tools, and more

Zhihu Crawler ⭐ 52

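A from-scratch implementation that crawls Zhihu on a schedule, mines valuable information from it, and visualizes the crawled data on a web page. The project is currently under development; you are welcome to…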

Spiderman ⭐ 49

a crawler with visualized config board

Devsearch ⭐ 45

A web search engine built with Python which uses TF-IDF and PageRank to sort search results.

MongoDB extensions for Scrapy

Scrapy Admin ⭐ 40

A django admin site for scrapy

Crawlers for data from Weibo, Baidu, Zhihu, newsmth, Tianya, and V2EX

Crawlerflow ⭐ 30

Web Crawlers orchestration framework that lets you create datasets from multiple web sources using yaml configurations.

Medicalkg ⭐ 26

Hands-on construction of a medical knowledge graph: crawls Baidu Baike data, stores structured triples in MongoDB, and uses neo4j… Medical Knowledge Graph; Crawler; neo4j

Animeapi ⭐ 25

☁️ A RESTful API for Anime Data.

Epicscrapy1024 ⭐ 25

BOOM💥BOOM💥BOOM💥!! Python3 + Scrapy + MongoDB. 5 million data items and 10 gigabytes of torrent files per day!!! 💥 The world's largest Chinese BBS.

Crawler for the Jianshu website, built with Node.js and MongoDB

Autohome ⭐ 23

Using Scrapy to crawl Autohome and store the data in MongoDB; simple analysis and NLP coming soon

Deadpool ⭐ 22

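A crawler application built on Celery as its main framework; crawler tasks can be added flexibly, and crawlers for multiple sites can run at the same time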

Ihealth_crawler ⭐ 21

Content crawler for the iHealth project (a medical consultation crawler based on Python and MongoDB)

Webgenome ⭐ 21

A breadth first web crawler that stores HTTP headers in a MongoDB database with a web front end all written in Go.

Movierater ⭐ 20

A useful website for finding movie ratings in Chinese and English by crawling Yahoo, PTT, and IMDb.

Lolcrawler ⭐ 19

Crawls League of Legends matches

Movie Scrapy ⭐ 19

Crawler for Mtime (时光网) movie data and posters

Stock_linebot_public ⭐ 17

The project for Linebot

crawl Google streetview images

The world's leading data crawler platform!

Crawler Douban Book ⭐ 16

Crawls Douban book data with Node.js and saves it to a MongoDB database.

Search Crawler ⭐ 16

Sample web crawler and search engine written in Node.js and MongoDB

Newspaper Crawler ⭐ 15

Scrapy-based crawler that crawls newspapers.

Spidey Mongo ⭐ 15

Implements a MongoDB back-end for Spidey (https://github.com/joeyAghion/spidey), a framework for crawling and scraping web sites.

Information Retrieval ⭐ 15

Elasticsearch, MongoDB, Tornado Server, RESTful API, Python, Information Retrieval, Machine Learning, Web Crawler

V2ex Crawler ⭐ 15

A simple single-threaded crawler for V2EX

Mongo Elasticsearch Nutch ⭐ 15

Docker image for creating a single Apache Nutch server, with mongodb as crawl storage and Elasticsearch for indexing

Ppspider_example ⭐ 14

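ppspider crawler examples: crawling Bilibili video information and comments, QQ Music information and comments, and Twitter topic comments and user information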

Scrapy Mongodb Queue ⭐ 13

Use scrapy with mongodb to store the request queues (FIFO or LIFO)

Google app information ninja; crawls Google app information

Web crawler / web scraper micro-service for downloading advertisements from http://olx.ua

Wgit allows you to crawl and extract the data you want from the web

Bot Marvin ⭐ 9

Highly scalable crawler with best features.

Crawlerutils ⭐ 9

Utils for programming web crawler

API for retrieving information about FII

Jobcrawler ⭐ 9

Scrapy project for crawling job information on 51Job, based on Scrapy and Python 3.

Nutch Mongo ⭐ 9

Dockerized Apache Nutch 2.3.1 configured for MongoDB

Playstore_crawler ⭐ 8

Python scalable Play Store crawler

Node Simplecrawler Queue Mongo ⭐ 8

MongoDB powered queue for Node Simple Crawler

Stocks Crawler ⭐ 8

Retrieve stocks data

Panda Bamboos Rank ⭐ 8

Python crawler for the panda bamboos rank

Githubcrawler ⭐ 8

Distributed GitHub crawler

Bangumispider ⭐ 8

Crawler for Bangumi.tv

Grapefruit Crawler ⭐ 8

Grapefruit 2.0 dht-crawler

Docker Nutch Elasticsearch Mongodb ⭐ 8

Docker Image for Apache Nutch, Elasticsearch and MongoDB

Node Crawler On Mongodb ⭐ 8

🕷 NodeJS + Puppeteer crawler on MongoDB

Disqus Crawler ⭐ 8

Crawl DISQUS comments from a blog into a local MongoDB database

Rubygems Crawler ⭐ 7

A little utility to download rubygems.org information - used for an ElasticSearch demo at RubyConf 2013

Scrapy_myanimelist ⭐ 7

Crawl anime, reviews and profiles from myAnimeList.net

Anemone_lite ⭐ 7

Distributed web crawler using mongodb

Pico Nova ⭐ 7

Python scraper / crawler for various torrent sites

Apple Store Crawler ⭐ 7

A simple, scalable scraper for app data listed on Apple's App Store

Simplified Search Engine ⭐ 7

Multithreaded Web Crawler, Scraper, Indexer

Shuoshuo_crawler ⭐ 7

A crawler for one of the most popular social networks. Works correctly on macOS with Xcode.

Crawl_3rd_party_stores ⭐ 6

An extensible crawler for crawling and downloading Android applications from third-party markets (supports Xiaomi / 360 Mobile Assistant / 应用宝 / 百度手…

NetEase Cloud crawler for building a valuable music chart!

Crawler Web Nodejs ⭐ 6

Web Crawler written in nodeJS and MongoDB.

A simple crawler implemented in Python 3.6

Information Crawler

Job Search Bot ⭐ 5

A Scrapy-based Python web crawler to notify users on a daily basis with up-to-date job postings.

Reddit Scraper ⭐ 5

Web scraper/crawler of Reddit page

Hkexnews_scrapy ⭐ 5

Uses Scrapy to fetch shareholding records for Shanghai-Hong Kong Stock Connect and Shenzhen-Hong Kong Stock Connect

Backend, modern REST API for obtaining match and odds data crawled from multiple sites. Using FastAPI, MongoDB as database, Motor as async MongoDB client, Scrapy as crawler and Docker.

Coursewebcrawler ⭐ 5

web crawler for courses on scrapy

Stackoverflowgospider ⭐ 5

Using Golang to crawl data from Stackoverflow

Xueqiu_crawl ⭐ 5

Fetches Xueqiu users and their cubes

Web_crawler_0608 ⭐ 5

Parentaljobs ⭐ 5

Parents friendly jobs portal

Copyright 2018-2024 Awesome Open Source. All rights reserved.