Crawling Infrastructure Alternatives

Distributed crawling infrastructure running on top of serverless computation, cloud storage (such as S3), and sophisticated queues.

Stars: 321
License: agpl-3.0
Open Issues: 22
Most Recent Commit: over 4 years ago
Programming Language: TypeScript
Dependent Repos: 0
Dependent Packages: 0
Total Releases: 0

Categories

Programming Languages > Typescript

Cloud Computing > Amazon Web Services

Cloud Computing > Cloud Computing

Web Browsers > Puppeteer

Alternatives To NikolaiT/Crawling-Infrastructure

Project Name	Stars	Repos Using This	Packages Using This	Most Recent Commit	Total Releases	Latest Release	Open Issues	License	Language
NikolaiT/Crawling-Infrastructure	321	0	0	over 4 years ago	0		22	agpl-3.0	TypeScript
Distributed crawling infrastructure running on top of serverless computation, cloud storage (such as S3), and sophisticated queues.
stopstalk/stopstalk-deployment	306	0	0	over 2 years ago	0		92	mit	Python
Stop stalking and start StopStalking :wink:
commoncrawl/cc-pyspark	280	0	0	over 3 years ago	0		4	mit	Python
Process Common Crawl data with Python and Spark
intoli/intoli-article-materials	255	0	0	over 3 years ago	0		85	other	JavaScript
All of the supporting materials for articles from Intoli's blog.
trek10inc/awsets	184	0	1	over 3 years ago	35	May 19, 2022	6	mit	Go
A utility for crawling an AWS account and exporting all its resources for further analysis.
MarcelloLins/ServerlessCrawler-VancouverRealState	66	0	0	almost 9 years ago	0		1	mit	Python
A Serverless Crawler For Real State Data in Vancouver Using AWS Lambda, Dynamo, RDS MySQL and CloudWatch
LeiShi1313/serverless-web-differ	60	0	0	almost 4 years ago	0		0	mit	Python
A serverless web browser which crawls websites and compares pages by schedule.
mylamour/blog	59	0	0	over 2 years ago	0		99		SCSS
Your internal mediocrity is the moment when you lost the faith of being excellent. Just do it.
rossf7/elasticrawl	50	1	0	over 9 years ago	10	February 15, 2017	1	mit	Ruby
Launch AWS Elastic MapReduce jobs that process Common Crawl data.
hfreire/browser-as-a-service	43	0	0	over 3 years ago	0		30	mit	JavaScript
A web browser :earth_americas: hosted as a service, to render your JavaScript web pages as HTML

Popular Crawler Projects

scrapy/scrapy ⭐ 49,918

Scrapy, a fast high-level web crawling & scraping framework for Python.

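NaiboWang/EasySpider ⭐ 43,770

A visual no-code/code-free web crawler/spider. EasySpider (易采集) is a visual tool for browser automation testing, data collection, and crawling that lets you design and run crawler tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service-wrapping system for web applications.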

iawia002/lux ⭐ 31,501

👾 Fast and simple video download library and CLI tool written in Go

gocolly/colly ⭐ 21,443

Elegant Scraper and Crawler Framework for Golang

jhao104/proxy_pool ⭐ 19,442

Python ProxyPool for web spider

Popular Amazon Web Services Projects

bregman-arie/devops-exercises ⭐ 60,067

Linux, Jenkins, AWS, SRE, Prometheus, Docker, Python, Ansible, Git, Kubernetes, Terraform, OpenStack, SQL, NoSQL, Azure, GCP, DNS, Elastic, Network, Virtualization. DevOps Interview Questions

localstack/localstack ⭐ 51,025

💻 A fully functional local AWS cloud stack. Develop and test your cloud & Serverless apps offline

ByteByteGoHq/system-design-101 ⭐ 50,529

Explain complex systems using visuals and simple terms. Help you prepare for system design interviews.

serverless/serverless ⭐ 45,767

⚡ Serverless Framework – Build web, mobile and IoT applications with serverless architectures using AWS Lambda, Azure Functions, Google CloudFunctions & more!

danny-avila/LibreChat ⭐ 38,686

Enhanced ChatGPT Clone: Features Agents, MCP, Skills, DeepSeek, Anthropic, AWS, OpenAI, Responses API, Azure, Groq, o1, GPT-5, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message search, Code Interpreter, langchain, DALL-E-3, OpenAPI Actions, Functions, Secure Multi-User Auth, Presets, open-source for self-hosting. Active
