Awesome Open Source
Search results for amazon web services crawler
Crawling Infrastructure
⭐
321
Distributed crawling infrastructure running on top of serverless computation, cloud storage (such as S3) and sophisticated queues.
Stopstalk Deployment
⭐
306
Stop stalking and start StopStalking 😉
Cc Pyspark
⭐
280
Process Common Crawl data with Python and Spark
Intoli Article Materials
⭐
255
All of the supporting materials for articles from Intoli's blog.
Awsets
⭐
184
A utility for crawling an AWS account and exporting all its resources for further analysis.
Serverlesscrawler Vancouverrealstate
⭐
66
A Serverless Crawler For Real Estate Data in Vancouver Using AWS Lambda, Dynamo, RDS MySQL and CloudWatch
Serverless Web Differ
⭐
60
A serverless web browser which crawls websites and compares pages by schedule.
Blog
⭐
59
Your internal mediocrity is the moment when you lost the faith of being excellent. Just do it.
Elasticrawl
⭐
50
Launch AWS Elastic MapReduce jobs that process Common Crawl data.
Browser As A Service
⭐
43
A web browser 🌎 hosted as a service, to render your JavaScript web pages as HTML
Wikireverse
⭐
39
Hadoop jobs for WikiReverse project. Parses Common Crawl data for links to Wikipedia articles.
Serverless Instagram Crawler
⭐
29
Serverless Instagram hashtag crawler with Lambda and DynamoDB
Lambda Dynamic Prerenderer
⭐
28
Dynamically prerender pages for bots and crawlers, with Lambda@Edge, S3 and CloudFront. No more need for isomorphic/server-side rendering!
Pokemongo Map Poc
⭐
27
🎃 POC project for Pokemon Go map
Utsusemi
⭐
27
A tool to generate a static website by crawling the original site.
Serverless Crawler Demo
⭐
27
Serverless Architecture Crawler demo
Staticizer
⭐
27
A tool to create a static version of a website for hosting on S3.
Steam_recommendation_system
⭐
25
Recommendation System, Collaborative Filtering, Spark, Hive, Flask, Web Crawler, AWS EC2, AWS RDS
Pywren Workshops
⭐
23
Various workshop labs that make use of pywren to massively process data in parallel with AWS Lambda
Nutch Aws
⭐
23
Teneo
⭐
22
Amazon S3 Step Functions Ingestion Orchestration
⭐
19
Design pattern for orchestrating an incremental data ingestion pipeline using AWS Step Functions from an on premise location into an Amazon S3 datalake bucket
Damons Data Lake
⭐
18
All the code related to building my own data lake
Cc Lambda
⭐
16
Search the Common Crawl using Lambda functions
Googleplay Web Crawler
⭐
15
MapReduce project using Hadoop, Nutch, AWS EMR, Pig, Tez, Hive
Glutil
⭐
13
Utilities for managing AWS Glue/Athena tables and partitions stored in S3
Pouch
⭐
12
Find websites with script URLs matching given regex
Aws Fargate Demo
⭐
10
AWS Fargate demo for AWSKRUG-recap
Building A Data Lake With Aws Glue And Amazon S3
⭐
10
Fragmenty
⭐
9
An infrastructure for crawling Fragment.com/numbers data, exposing it through an API, and visualizing it
Quicksightathena01
⭐
9
Amazon QuickSight and Amazon Athena workshop. The workshop focuses on ingesting data into Athena, combining it with other data sources, and visualizing it in QuickSight.
Zmon Aws Agent
⭐
9
AWS API crawler to auto discover running services in your account
Serverlessnycparkseventssitecrawler
⭐
8
Frontendmasters Crawler
⭐
8
A demo of a serverless crawler built on AWS Lambda (scheduled tasks) that stores results in S3
Lastfm Scrobble Purger
⭐
7
A tool for mass deleting last.fm scrobbles
Reinvent2018_aim416
⭐
6
AIM416 workshop material for AWS re:Invent 2018
Common Crawl Malayalam
⭐
5
Useful tools to extract Malayalam text from the Common Crawl datasets
Nutchpighive
⭐
5
Crawl Google Play data with Nutch, ETL with Pig, analyze with Hive
Distributed Web Crawler With Celery
⭐
5
Python: selenium, beautifulsoup2, celery, rabbitmq, Amazon AWS(EC2, S3)
Webcrawler
⭐
5
A Recursive Web crawler built with Java 8, reactive streams, async queues and AWS DynamoDB.
Copyright 2018-2024 Awesome Open Source. All rights reserved.