Awesome Open Source

Search results for scraper spider

115 search results found

Colly ⭐ 21,902

Elegant Scraper and Crawler Framework for Golang

Easyspider ⭐ 20,149

A visual no-code/code-free web crawler/spider. 易采集 (EasySpider): a visual browser automation testing, data collection, and crawler application that supports no-code, graphical

Avbook ⭐ 8,777

AV movie management system; crawler for avmoo, javbus, and javlibrary; online AV film library; AV magnet link database. Japanese Adult Video Library, Adult Video Magnet Links - Japanese Adult Video Database

Awesome Web Scraping ⭐ 6,060

List of libraries, tools and APIs for web scraping and data processing.

Awesome Crawler ⭐ 5,859

A collection of awesome web crawlers and spiders in different languages

Douyin_tiktok_download_api ⭐ 4,844

🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous Douyin, Kuaishou, T

Browser Fingerprinting ⭐ 3,353

Analysis of Bot Protection systems with available countermeasures 🚿. How to defeat anti-bot system 👻 and get around browser fingerprinting scripts 🕵️‍♂️ when scraping the web?

Querylist ⭐ 2,598

🕷️ The progressive PHP crawler framework! An elegant, progressive PHP scraping framework.

Web Scraping Framework

Geziyor ⭐ 1,892

Geziyor, blazing fast web crawling & scraping framework for Go. Supports JS rendering.

Scrapy Cluster ⭐ 1,137

This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.

Django Dynamic Scraper ⭐ 1,069

Creating Scrapy scrapers via the Django admin interface

Crawler User Agents ⭐ 1,045

Syntactic patterns of HTTP user-agents used by bots / robots / crawlers / scrapers / spiders. Pull requests welcome ⭐

Querido Diario ⭐ 944

📰 Brazilian government gazettes, accessible to everyone.

Crawler ⭐ 897

A high performance web crawler / scraper in Elixir.

Kimuraframework ⭐ 874

Kimurai is a modern web scraping framework written in Ruby that works out of the box with Headless Chromium/Firefox, PhantomJS, or simple HTTP requests and lets you scrape and interact with JavaScript-rendered websites

Scrapyrt ⭐ 793

HTTP API for Scrapy spiders

Crawly, a high-level web crawling & scraping framework for Elixir.

A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.

Linkedin ⭐ 602

Linkedin Scraper using Selenium Web Driver, Chromium headless, Docker and Scrapy

Newcrawler ⭐ 583

Free Web Scraping Tool with Java

OSINT Swiss Army Knife

Alltheplaces ⭐ 502

A set of spiders and scrapers to extract location information from places that post their location on the internet.

Spidermon ⭐ 486

Scrapy Extension for monitoring spiders execution.

Webster ⭐ 465

a reliable high-level web crawling & scraping framework for Node.js.

Awesome Scrapy ⭐ 450

A curated list of awesome packages, articles, and other cool resources from the Scrapy community.

The fastest web crawler written in Rust. Maintained by @a11ywatch.

Fbcrawl ⭐ 415

A Facebook crawler

Linkedin Profile Scraper Api ⭐ 404

🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON.

Linkedindumper ⭐ 337

Python 3 script to dump/scrape/extract company employees from LinkedIn API

Freshonions Torscraper ⭐ 313

Fresh Onions is an open source TOR spider / hidden service onion crawler hosted at zlal32teyptf4tvi.onion

[WIP] GOPA, a spider written in Golang, for Elasticsearch. DEMO: http://index.elasticsearch.cn

Java Spider ⭐ 276

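A hands-on Java crawler framework built as a secondary development on the webmagic framework; it can already crawl Tencent, Sohu, and Toutiao (separately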

A web crawler for Go

A flexible, friendly crawler framework

Dark Fantasy Hack Tool ⭐ 248

DDOS Tool: To take down small websites with HTTP FLOOD. Port scanner: To know the open ports of a site. FTP Password Cracker: To hack the file system of websites. Banner Grabber: To get the service or software running on a port. (Once you know the software running, google its vulnerabilities.) Web Spider: For gathering web application hacking information. Email scraper: To get all emails related to a webpage. IMDB Rating: Easy way to access the movie database. Both .exe(compressed as zip) and .py

Awesome Crawler Cn ⭐ 243

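A roundup of web crawlers, spiders, data collectors, and web page parsers. Because new technologies keep evolving and new frameworks keep appearing, this list is updated continuously.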

Nudecrawler ⭐ 231

Crawl telegra.ph searching for nudes!

Wayback Machine Scraper ⭐ 219

A command-line utility and Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.

Readablewebproxy ⭐ 195

Rewriting web proxy and archival tool. At this point, it just tries to download all the things.

A loose framework for crawling and scraping web sites.

Antch, a fast, powerful and extensible web crawling & scraping framework for Go

Scheduler of spiders for scraping and parsing HTML and JSON pages

Goribot ⭐ 162

[Crawler/Scraper for Golang] 🕷 A lightweight, distributed-friendly Golang crawler framework.

Search Engine Google ⭐ 155

🕷️ Google client for SERPS

Aliexpress Product Scraper ⭐ 152

Get Aliexpress product details as a JSON response, including feedback, variants, shipping info, description, images, etc.

Linkedin Scraper ⭐ 143

A Playwright bot that scrapes LinkedIn and stores advertisement data in a database and a Telegram channel

Domain names collector - Crawl websites and collect domain names along with their availability status.

Not Your Average Web Crawler ⭐ 130

A web crawler (for bug hunting) that gathers more than you can imagine.

Youtube Watch History Scraper ⭐ 126

Scrapy YouTube watch history spider. Because YouTube didn't have a history search.

Linkedinscraper ⭐ 112

Scrapes public information off of LinkedIn

Instagram Scraper ⭐ 105

Some Scrapy spiders for crawling Instagram posts using public APIs (no token)

Blinkist M4a Downloader ⭐ 97

Grabs all of the audio files from all of the Blinkist books

Aliexscrape ⭐ 84

Get Aliexpress product details in JSON

Openscraper ⭐ 80

An open source webapp for scraping: towards a public service for webscraping

Awesome Python Primer ⭐ 78

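An index of high-quality Chinese-language resources for self-taught Python beginners, including books, documentation, and videos, covering crawling, web, data analysis, and machine learning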

Wget Lua ⭐ 72

Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.

Distributed Multi User Scrapy System With A Web Ui ⭐ 71

Django based application that allows creating, deploying and running Scrapy spiders in a distributed manner

Scrapy Wayback Machine ⭐ 70

A Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.

Transfermarkt Scraper ⭐ 69

🕸️ Collects data from Transfermarkt website

Robotstxt ⭐ 68

robots.txt file parsing and checking for R

Scrapy S3pipeline ⭐ 66

Scrapy pipeline to store chunked items into Amazon S3 or Google Cloud Storage bucket.

A Rust crate for manipulating HTML with CSS selectors

Scrapy Spider Example ⭐ 62

Scrapy spider example for Scrapy Tutorial Series

Steam Scraper ⭐ 55

A pair of spiders for scraping product data and reviews from Steam.

Tool Gin ⭐ 54

A project built on the go-gin framework to cut down on redundant work, e.g. downloading some tools

Pythonscrapybasicsetup ⭐ 54

Basic setup with random user agents and IP addresses for Python Scrapy Framework.

Learn.scrapinghub.com ⭐ 49

Scrapinghub Learning Center. Report issues in Jira: https://scrapinghub.atlassian.net/projects/WEB

Scalawebscraper ⭐ 48

Scala Webscraper

Crawler Chrome Extensions ⭐ 46

Chrome extensions commonly used by crawler developers

Scrapy (Python Framework) Example using reddit.com

Caseharvester ⭐ 44

AWS-based application for scraping the Maryland Judiciary Case Search

Transparencia Gov Br ⭐ 41

Scraper for the Federal Government's Portal da Transparência, in Python 3

Scrapy Distributed ⭐ 40

A series of distributed components for Scrapy. Including RabbitMQ-based components, Kafka-based components, and RedisBloom-based components for Scrapy.

Ronin Web ⭐ 40

ronin-web is a collection of useful web helper methods and commands.

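Image crawling and download tool: quickly crawls and downloads images, photos, and illustrations uploaded by designers and users on ZCOOL (https://www.zcool.com.cn/) and CNU 视觉 (http://www.cnu.cc/).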

Scrapyd Authenticated ⭐ 39

Docker container running scrapyd with HTTP authentication

Scrapemate ⭐ 39

Golang Crawling and scraping framework

A fast and powerful web scraping library

Tigerspider ⭐ 36

tigerspider: a fast high-level screen scraping and web crawling framework for Python.

A Fast and Powerful Scraping and Web Crawling Framework.

Go Crawler ⭐ 33

A web crawling framework implemented in Golang. It is simple to write and delivers powerful performance. It comes with a wide range of practical middleware and supports various parsing and storage methods. Additionally, it supports distributed deployment.

Imdbspider ⭐ 33

A Scrapy spider for scraping IMDB movie info

Noscrape ⭐ 32

obfuscate text via node to make scraping your content really difficult

A web crawler / scraper engine written in Golang

This was the night of the crawling terror!

Spider is a Web spidering library for Ruby. It handles the robots.txt, scraping, collecting, and looping so that you can just handle the data.

Scrapeops Scrapy Sdk ⭐ 27

Scrapy extension that gives you all the scraping monitoring, alerting, scheduling, and data validation you will need straight out of the box.

Webmagician Ui ⭐ 26

An admin UI project for a configurable web crawler platform

A web spider for shodan.io without using the Developer API.

Scrapy Scrapingbee ⭐ 26

JavaScript support and proxy rotation for Scrapy with ScrapingBee.

Soundcloud Scraper ⭐ 24

A Scrapy spider to scrape user and track information from SoundCloud.

Scrapebox ⭐ 23

A simple, system-independent infrastructure for performing web scraping. Uses Vagrant's VirtualBox interface and Puppet provisioning to scrape web content into structured data quickly and easily without modifying your core system.

Assessor Scraper ⭐ 22

A project to scrape the assessor's website and make the data accessible for advanced queries

Board Game Scraper ⭐ 21

Board game data scraper

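《老色批》 ("Old Pervert"): "Study the new ideas, strive to be a new pervert", "I have to look at girls every day, for no other reason than to keep myself in a good mood~", "Go to bed late, get up late, don't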

Pricemory ⭐ 20

Tracking and display of price history of products from Paraguay

Anime Tracker ⭐ 20

🕸️ All in one place to track your favorite animes

Detectorist Scraper ⭐ 19

A scrapy spider to extract post, thread, and user information from a vBulletin forum to a MongoDB database.

This platform searches thousands of job boards across different technologies from all over the world.

1-100 of 115 search results

Copyright 2018-2024 Awesome Open Source. All rights reserved.