Awesome Open Source

Search results for ruby crawler

83 search results found

Arachni ⭐ 3,632

Web Application Security Scanner Framework

Anemone ⭐ 1,615

Anemone web-spider framework

Wombat ⭐ 1,297

Lightweight Ruby web crawler/scraper with an elegant DSL which extracts structured data from pages.

Kimuraframework ⭐ 874

Kimurai is a modern web scraping framework written in Ruby which works out of the box with Headless Chromium/Firefox, PhantomJS, or simple HTTP requests and allows you to scrape and interact with JavaScript-rendered websites

A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.

Device_detector ⭐ 711

DeviceDetector is a precise and fast user agent parser and device detector written in Ruby

Fast high-level web crawling Ruby framework

Bounty Targets ⭐ 522

This project crawls bug bounty platform scopes (like Hackerone/Bugcrowd/Intigriti/etc) hourly and dumps them into the bounty-targets-data repo

Tarantula ⭐ 453

a big hairy fuzzy spider that crawls your site, wreaking havoc

Playdrone ⭐ 405

Google Play Crawler

Prerender_rails ⭐ 358

Rails middleware gem for prerendering JavaScript-rendered pages on the fly for SEO

Archivebot ⭐ 328

ArchiveBot, an IRC bot for archiving websites

Woothee ⭐ 304

User-Agent parser/classifier for multiple languages

Harvestman ⭐ 278

Quick and dirty web crawling.

A Ruby DSL for structured web crawling, with a robust caching system.

Rawler is a tool that crawls the links of your website

A loose framework for crawling and scraping web sites.

Voight Kampff ⭐ 171

Voight-Kampff is a Ruby gem that detects bots, spiders, crawlers and replicants

Instagram Crawler ⭐ 157

Crawl instagram photos, posts and videos for download.

Rubyretriever ⭐ 139

Asynchronous Web Crawler & Scraper

Saushengine.v1 ⭐ 118

Simple Ruby-based search engine

Crawler_detect ⭐ 106

Ruby gem to detect bots and crawlers via the user agent

Human_power ⭐ 97

Easy generation of robots.txt. Force the robots into submission!

Polipus: distributed and scalable web-crawler framework

Malheatmap ⭐ 87

An extension for tracking your activities on myanimelist.net

Environment specific robots.txt for your Rails Apps

Woothee Ruby ⭐ 70

Woothee ruby implementation

Bucketlist ⭐ 68

Amazon S3 bucket spelunking!

Modsec Flameeyes ⭐ 68

Flameeyes's Ruleset for ModSecurity

Google Ajax Crawler ⭐ 58

Rack middleware adhering to the Google Ajax Crawling Scheme, using a headless browser to render JS-heavy pages and serve a DOM snapshot of the rendered state to a requesting search engine.

A client library for pixiv

Netcrawl ⭐ 51

LLDP/CDP crawler

Snapcrawl ⭐ 51

Crawl a website and take screenshots

Elasticrawl ⭐ 50

Launch AWS Elastic MapReduce jobs that process Common Crawl data.

Regexp_crawler ⭐ 45

A crawler which uses regular expressions to extract data from websites.

Cosmicrawler ⭐ 44

Cosmicrawler is a crawler library for Ruby. It provides scalable asynchronous crawling by (http|file|etc) using EventMachine.

Ronin Web ⭐ 40

ronin-web is a collection of useful web helper methods and commands.

##crawl bot Sequell; depends on https://github.com/crawl/go-sequell

Wayback_archiver ⭐ 39

Ruby gem to send URLs to Wayback Machine

Tosback3 ⭐ 39

ToSBack crawls, archives and tracks changes in terms of service and privacy policies. ToSBack3, inspired by EFF's ToSBack, is built in Ruby on Rails and features a web interface.

Arachnid ⭐ 38

Extremely fast and efficient Ruby domain spider

A no longer functioning library for parsing and crawling LCBO.com

Appium Native Crawler ⭐ 38

Appium Native Crawler CLI - Features include: Screenshots, Performance, Accessibility Detection, Google Translate, Applitools, Monkey Tester

Otvorenesudy ⭐ 37

Open Courts Rails Application

Web crawler with a Ruby API

Is_crawler ⭐ 30

is_crawler does exactly what you might think it does: determine if the supplied string matches a known crawler or bot.

Craigslist Ruby Crawler ⭐ 29

Ruby Crawler that crawls Craigslist for specific keywords and returns HTML output with a list of links

Playdrone Kitchen ⭐ 28

Kitchen for the Google Play Crawler cluster

Common_crawl_types ⭐ 28

A simple Ruby example of how to process Common Crawl files using Elastic MapReduce

Staticizer ⭐ 27

A tool to create a static version of a website for hosting on S3.

A simple Ruby directory crawler DSL

Ccc_privacy_crawler ⭐ 26

A Twitter bot that announces new additions to the list of companies for which the sharing of T-Card personal information can be stopped

Pinterest_crawler ⭐ 25

crawling pinterest

Kontests ⭐ 25

Competitive programming contests schedule

Ofxaddons.com ⭐ 25

Web app for indexing OpenFrameworks addons

Counters ⭐ 23

Easily record any metric in your system

Commit Crawler ⭐ 23

Crawler for GitHub commit messages

Hashbang ⭐ 22

Magic support of Google/Bing/... AJAX search indexing for your apps

A mean little DSL'd poltergeist (capybara) based web crawler that stuffs data into your Rails app.

Legitbot ⭐ 19

🤔 Is this Web request from a real search engine🕷 or from an impersonating agent 🕵️‍♀️?

crawler by file extension

Kimurai is a modern web scraping framework written in Ruby which works out of the box with headless Chromium/Firefox, PhantomJS, or simple HTTP requests and allows you to scrape and interact with JavaScript-rendered websites

Knife Crawl ⭐ 18

knife plugin to display role hierarchies

tweetlr crawls twitter for a given term, extracts photos out of the collected tweets' short urls and posts the images to tumblr. nice!

Installer ⭐ 17

Installation script for Codename SCNR.

Domain_crawl ⭐ 17

Crawl an entire domain with Zillabyte

Simple async HTTP crawler based on em-synchrony

Javlibrary ⭐ 16

Javlibrary-Crawler lib (gem), an easy way to build your own jav-database.

Crawler and content extractor for building a full text index of a website's contents. Uses Ferret for indexing.

Facebook Cleaner ⭐ 16

Remove (most) personal data from Facebook

Spiderman ⭐ 16

your friendly neighborhood web crawler

Spidey Mongo ⭐ 15

Implements a MongoDB back-end for Spidey (https://github.com/joeyAghion/spidey), a framework for crawling and scraping web sites.

Proxycrawl Ruby ⭐ 14

ProxyCrawl API ruby gem for scraping and crawling

Rails Crawler ⭐ 14

Crawls a Rails project looking for dead links, unused routes or other problems

Saushengine ⭐ 14

Email_spider ⭐ 14

A simple Ruby web spider that uses Anemone to crawl every page of a site looking for email addresses. Stores the results with SQLite3 using Data Mapper.

A simple networked dungeon crawler

Nicopodcast ⭐ 13

make a podcast from a mylist

The Gatherer ⭐ 12

Ruby based framework to streamline data collection, storage and analysis tasks.

Instagram Crawler ⭐ 12

Short Ruby scripts to download images and videos from Instagram by crawling users or hashtags

Proxy_manager ⭐ 12

Ruby proxy manager. A gem for easy proxy usage in parsers and web bots.

Deschutes ⭐ 11

Obey_robots_dot_txt ⭐ 11

Easy-to-use extension of Net::HTTP that lets you obey robots.txt while crawling/scraping/mining

Yet another dirbuster tool

Webmaster_tools ⭐ 10

Gives access to Webmaster Tools Interface data programmatically which is not provided by the official API

VersionEye crawlers implemented in Ruby.

Wgit allows you to crawl and extract the data you want from the web

Crawl_station ⭐ 9

basil(isk): a front-end for the anemone web crawler.

Roundabout ⭐ 9

The Roundabout crawler is an experiment on high-performance distributing techniques and their feasibility when it comes to website crawling.

Eoshub.io ⭐ 9

Linkedincrawler ⭐ 9

Crawls public LinkedIn profiles

Diy_twitter_client ⭐ 9

a simple twitter client that learns what i like to read

Fashion_check_ranking ⭐ 8

Dead simple yet powerful Ruby crawler for easy parallel crawling with support for anonymity.

Simple Ruby web crawler

Arachnid2 ⭐ 8

a simple, fast web-crawler written in Ruby using Watir or Typhoeus

Rails_analyzer_tools ⭐ 7

An example job that converts Common Crawl archived web pages into text

Ronin Web Spider ⭐ 7

A collection of common web spidering routines
