Awesome Open Source

Search results for scraper crawler

300 search results found

Crawley ⭐ 167

Pythonic Crawling / Scraping Framework based on Non Blocking I/O operations.

Findpapers ⭐ 164

Findpapers: A tool for helping researchers who are looking for related works

Goribot ⭐ 162

[Crawler/Scraper for Golang] 🕷 A lightweight, distributed-friendly Golang crawler framework.

Instagram Crawler ⭐ 157

Crawl instagram photos, posts and videos for download.

Aliexpress Product Scraper ⭐ 152

Get Aliexpress product details as a JSON response, including feedback, variants, shipping info, description, images, etc.

Site Audit Seo ⭐ 151

Web service and CLI tool for SEO site audit: crawl site, lighthouse all pages, view public reports in browser. Also output to console, json, csv, xlsx, Google Drive.

Media Scraper ⭐ 150

Scrapes all photos and videos in a web page / Instagram / Twitter / Tumblr / Reddit / pixiv / TikTok

OWASP D4N155 - Intelligent and dynamic wordlist using OSINT

Google News Scraper ⭐ 144

Lightweight scraper for Google News

estela, an elastic web scraping cluster 🕸

Domain names collector - Crawl websites and collect domain names along with their availability status.

Onegram ⭐ 136

This repository is no longer maintained.

a command-line web scraping tool

Not Your Average Web Crawler ⭐ 130

A web crawler (for bug hunting) that gathers more than you can imagine.

Grawler ⭐ 128

Grawler is a tool written in PHP, with a web interface, that automates the task of using Google dorks, scrapes the results, and stores them in a file.

Double Agent ⭐ 120

A test suite of common scraper detection techniques. See how detectable your scraper stack is.

Interactive CLI Web Crawler

Scraply ⭐ 114

Scraply: a simple DOM scraper to fetch information from any HTML-based website

Od Database ⭐ 113

Distributed crawler, database and web frontend for public directories indexing

Gflare Tk ⭐ 110

Open-Source Python Based SEO Web Crawler

Zyte Smartproxy Headless Proxy ⭐ 106

A complementary proxy that helps you use SPM with headless browsers

Instagram Scraper ⭐ 105

Some Scrapy spiders useful for crawling Instagram posts using public APIs (no token)

Seleniumcrawler ⭐ 105

An example using Selenium webdrivers for Python and the Scrapy framework to create a web scraper that crawls an ASP site

Node Web Crawler ⭐ 104

A web scraper with a web user interface that shows scraping stats in real time. Uses Node.js, jQuery, socket.io and Express.

Basketball_reference ⭐ 98

Basketball Reference Scraper

Blinkist M4a Downloader ⭐ 97

Grabs all of the audio files from all of the Blinkist books

Actor Scraper ⭐ 93

House of Apify Scrapers. Generic scraping actors with a simple UI to handle complex web crawling and scraping use cases.

python Movie Info Web Crawler

Bots Zoo ⭐ 90

A platform displaying the latest software engineer job information to entry-level new graduates

Web Crawler for Crabs

Aliexscrape ⭐ 84

Get Aliexpress product details in JSON

Scrapper ⭐ 83

Web scraper with a simple REST API living in Docker and using a Headless browser and Readability.js for parsing.

Ca Property Tax ⭐ 78

CA property tax visualization

Awesome Python Primer ⭐ 78

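An index of high-quality Chinese-language resources for self-taught Python beginners, including books / docs / videos, suited to crawler / Web / data analysis / machine learning tracks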

Dorkscout ⭐ 78

DorkScout - Golang tool to automate Google dork scans against the entire internet or specific targets

Webscrapper ⭐ 77

Simple and powerful all-in-one Telegram bot to scrape webpages using Requests, html5lib and BeautifulSoup

Ceiba Dl ⭐ 77

NTU CEIBA data download tool

Goodreadsscraper ⭐ 76

Scrape data from Goodreads using Scrapy and Selenium 📚

Your preferred open source focused crawler for the deep web.

Scraping Ebay ⭐ 73

Scraping Ebay's products using Scrapy Web Crawling Framework

Spotifyscraper ⭐ 72

Spotify scraper to extract all the information from Spotify and download MP3s with the song's cover

Weibo Scraper ⭐ 72

Simple Weibo Scraper

Wget Lua ⭐ 72

Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.

Tiktok Scraper Php ⭐ 69

Tiktok (Musically) PHP scraper

Robotstxt ⭐ 68

robots.txt file parsing and checking for R

ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of different websites. On the websites, ARGUS is able to perform tasks like scraping texts or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-0

Newspaper4k ⭐ 66

📰 Newspaper4k, a fork of the beloved Newspaper3k. Extraction of articles, titles, and metadata from news websites.

Collyzar ⭐ 65

Distributed redis-based web crawler framework for colly

Recipebook ⭐ 65

This is a simple application for scraping and parsing food recipe data found on the web in hRecipe format, producing results in JSON

Local Api Examples ⭐ 64

Useful and easy to understand examples written in Node.js and .NET Core about web scraping and automated browsing with Kameleo Client

Dotnetcrawler ⭐ 63

DotnetCrawler is a straightforward, lightweight web crawling/scraping library with Entity Framework Core output, based on .NET Core. The library is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extensible to fit your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-w

Newspaperjs ⭐ 63

News extraction and scraping. Article Parsing

Screen scraping and web crawling framework

Webreaper ⭐ 59

Web scraper, crawler and parser in C#. Designed as simple, declarative and scalable web scraping solution.

Goscraper ⭐ 58

Golang pkg to quickly return a preview of a webpage (title/description/images)

Tor_spider ⭐ 57

Python project to crawl and scrape the lesser-known deep web (or, one could say, the dark web). Just provide the onion link and get started.

Proxycrawl Python ⭐ 57

ProxyCrawl Python library for scraping and crawling

Social Scraper ⭐ 56

A collection of scripts for crawling data from social networks and Vietnamese-language websites

Steam Scraper ⭐ 55

A pair of spiders for scraping product data and reviews from Steam.

Local Api Client Typescript ⭐ 54

Official JavaScript/TypeScript library for interacting with Kameleo Client

Tool Gin ⭐ 54

A project built on the go-gin framework to cut down on redundant actions, e.g. downloading some tools

Diffbot Php Client ⭐ 53

[Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library

Local Api Client Python ⭐ 53

Official Python library for interacting with Kameleo Client

Metacritic_api ⭐ 52

PHP Metacritic API - Mirror from my GitLab

Html Table Extractor ⭐ 51

Extract data from HTML tables

Timbr_v1 ⭐ 50

A web service that turns an arbitrary web page into structural JSON data and easy-to-use APIs with just a few clicks

Tikscraperphp ⭐ 50

Wrapper for TikTok API

Scraplat ⭐ 50

I'm trying to finish the scraplat as a scraper platform

Web Scraping Framework

Tieba Zhuaqu ⭐ 49

Distributed crawler for Baidu Tieba, used for Tieba data mining. Analyzes data along both the forum dimension and the user dimension

Learn.scrapinghub.com ⭐ 49

Scrapinghub Learning Center. Report issues in Jira: https://scrapinghub.atlassian.net/projects/WEB

Skyscraper ⭐ 48

An asynchronous web scraper / web crawler using async / await and Reactive Extensions

Lezhin Comics Downloader ⭐ 48

📥 Downloader for lezhin comics

Crawler Chrome Extensions ⭐ 46

Chrome extensions commonly used by crawler developers

Wishlist ⭐ 46

Read an Amazon wishlist programmatically with Python

Medium Crawler ⭐ 45

A crawler for scraping posts from medium.com

Fb Page Chat Download ⭐ 44

Python script to download messages from a Facebook page to a CSV file

Browser As A Service ⭐ 43

A web browser 🌎 hosted as a service, to render your JavaScript web pages as HTML

Crawling Projects ⭐ 43

Web scraping and automation using python

Proxifier ⭐ 43

A fast, modern and intelligent proxy rotator perfect for crawling and scraping public data.

🌱 goclone - clone websites in a matter of seconds

Uber_data ⭐ 40

Uber web interface crawler / scraper - Convert the trips table into a CSV file

Scrapy.dart ⭐ 40

Scrapy, a fast, high-level web crawling & scraping framework for Dart and Flutter

Scrapy Distributed ⭐ 40

A series of distributed components for Scrapy. Including RabbitMQ-based components, Kafka-based components, and RedisBloom-based components for Scrapy.

Jason The Miner ⭐ 40

⛏ A versatile Web scraper for Node.js

Ronin Web ⭐ 40

ronin-web is a collection of useful web helper methods and commands.

Warta Scrap ⭐ 40

Indonesia Index News Crawler, including 10 online media

Web Scraping Tutorial ⭐ 40

Videos on youtube

Facebookcrawler ⭐ 39

Facebook Crawler - Crawl information from facebook

Caterpillar ⭐ 39

Caterpillar is a PHP library intended for website crawling and screen scraping. It handles parallel requests using the curl_multi functions.

Scrapemate ⭐ 39

Golang Crawling and scraping framework

Local Api Client Csharp ⭐ 39

This .NET Standard package provides convenient access to the Local API REST interface of the Kameleo Client.

Shopify App Store Scraper ⭐ 38

Crawler behind the Shopify App Marketplace dataset

A fast and powerful web scraping library

Searchenginescrapy ⭐ 38

Scrape data from Google.com, Bing.com, Baidu.com, Ask.com, Yahoo.com, Yandex.com

Comcrawl ⭐ 37

A python utility for downloading Common Crawl data

gRPC web crawler turbo charged for performance

A Fast and Powerful Scraping and Web Crawling Framework.

Crawling, scraping and indexing application written in Clojure.

101-200 of 300 search results

Copyright 2018-2024 Awesome Open Source. All rights reserved.