Awesome Open Source
Search results for scraper
4,239 search results found
Scrapy
⭐
49,918
Scrapy, a fast high-level web crawling & scraping framework for Python.
Huginn
⭐
41,465
Create agents that monitor and act on your behalf. Your agents are standing by!
Devdocs
⭐
33,315
API Documentation Browser
Cheerio
⭐
27,702
The fast, flexible, and elegant library for parsing and manipulating HTML and XML.
Lux
⭐
24,752
👾 Fast and simple video download library and CLI tool written in Go
Colly
⭐
21,902
Elegant Scraper and Crawler Framework for Golang
Easyspider
⭐
20,149
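A visual no-code/code-free web crawler/spider. EasySpider: a visual browser automation testing, data collection, and crawler application that can be used graphically without writing code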
Newspaper
⭐
13,147
News, full-text, and article metadata extraction in Python 3. Advanced docs:
Requests Html
⭐
13,100
Pythonic HTML Parsing for Humans™
Crawlee
⭐
12,106
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.
Webmagic
⭐
11,080
A scalable web crawler framework for Java.
Jsoup
⭐
10,463
jsoup: the Java HTML parser, built for HTML editing, cleaning, scraping, and XSS safety.
Chinese Xinhua
⭐
10,425
📙 Chinese Xinhua Dictionary database, including xiehouyu (two-part allegorical sayings), idioms, words, and characters.
Portia
⭐
8,982
Visual scraping for Scrapy
Avbook
⭐
8,777
AV movie management system; crawlers for avmoo, javbus, and javlibrary; online AV video library; AV magnet link database; Japanese Adult Video Library, Adult Video Magnet Links - Japanese Adult Video Database
Undetected Chromedriver
⭐
7,232
Custom Selenium Chromedriver | Zero-Config | Passes ALL bot mitigation systems (like Distil / Imperva / DataDome / CloudFlare IUAM)
Tabula
⭐
6,488
Tabula is a tool for liberating data tables trapped inside PDF files
Awesome Web Scraping
⭐
6,060
List of libraries, tools and APIs for web scraping and data processing.
Awesome Crawler
⭐
5,859
A collection of awesome web crawlers and spiders in different languages
Ferret
⭐
5,540
Declarative web scraping
Autoscraper
⭐
5,159
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
Headless Chrome Crawler
⭐
5,051
Distributed crawler powered by Headless Chrome
Douyin_tiktok_download_api
⭐
4,844
🚀 "Douyin_TikTok_Download_API" is an out-of-the-box, high-performance asynchronous Douyin, Kuaishou, T
Rod
⭐
4,505
A Devtools driver for web automation and scraping
Mechanize
⭐
4,338
Mechanize is a ruby library that makes automated web interaction easy.
Mygptreader
⭐
4,267
A community-driven way to read and chat with AI bots - powered by chatGPT.
Python Scraping
⭐
4,136
Code samples from the book Web Scraping with Python http://shop.oreilly.com/product/0636920034391.do
Node Ytdl Core
⭐
4,087
YouTube video downloader in javascript.
Node Osmosis
⭐
4,083
Web scraper for NodeJS
Snscrape
⭐
3,992
A social networking service scraper in Python
Scrape It
⭐
3,978
🔮 A Node.js scraper for humans.
Data Science
⭐
3,898
Collection of useful data science topics along with articles, videos, and code
Scraperjs
⭐
3,575
A complete and versatile web scraper.
Tiktok Scraper
⭐
3,554
TikTok Scraper. Download video posts, collect user/trend/hashtag/music feed metadata, sign URLs, etc.
Fake Useragent
⭐
3,356
Up-to-date simple useragent faker with real world database
Browser Fingerprinting
⭐
3,353
Analysis of Bot Protection systems with available countermeasures 🚿. How to defeat anti-bot systems 👻 and get around browser fingerprinting scripts 🕵️‍♂️ when scraping the web?
Automatic Udemy Course Enroller Get Paid Udemy Courses For Free
⭐
3,010
Do you want to LEARN NEW STUFF for FREE? Don't worry, with the power of web-scraping and automation, this script will find the necessary Udemy coupons & enroll you for PAID UDEMY COURSES, ABSOLUTELY FREE!
Python
⭐
2,978
Python Books && Courses
Instagram Php Scraper
⭐
2,928
Get account information, photos, videos, stories and comments.
Tweets_analyzer
⭐
2,894
Tweets metadata scraper & activity analyzer
Panther
⭐
2,849
A browser testing and web crawling library for PHP and Symfony
Emby.plugins.javscraper
⭐
2,687
A Japanese movie scraper plugin for Emby/Jellyfin that can fetch movie information from certain websites.
Querylist
⭐
2,598
🕷️ The progressive PHP crawler framework! An elegant, progressive PHP scraping framework.
Googlescraper
⭐
2,540
A Python module to scrape several search engines (like Google, Yandex, Bing, Duckduckgo, ...). Including asynchronous networking support.
Snoop
⭐
2,530
Snoop: a reconnaissance tool based on open data (OSINT world)
Aos Avp
⭐
2,515
NOVA opeN sOurce Video plAyer: main repository to build them all
Trafilatura
⭐
2,447
Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments
Grab
⭐
2,292
Web Scraping Framework
Thal
⭐
2,268
Getting started with Puppeteer and Chrome Headless for Web Scraping
Weibo_terminater
⭐
2,265
Final Weibo crawler: scrape anything from Weibo (comments, post contents, followers, anything). The Terminator
Awesome Puppeteer
⭐
2,245
A curated list of awesome puppeteer resources.
Bulk Downloader For Reddit
⭐
2,142
Downloads and archives content from reddit
Freedictionaryapi
⭐
2,115
There was no free Dictionary API on the web when I wanted one for my friend, so I created one.
Google Play Scraper
⭐
2,108
Node.js scraper to get data from Google Play
Soup
⭐
2,074
Web Scraper in Go, similar to BeautifulSoup
Embed
⭐
2,052
Get info from any web service or page
Facebook Page Post Scraper
⭐
2,014
Data scraper for Facebook Pages, and also code accompanying the blog post How to Scrape Data From Facebook Page Posts for Statistical Analysis
Facebook Scraper
⭐
1,936
Scrape Facebook public pages without an API key
Geziyor
⭐
1,892
Geziyor, blazing fast web crawling & scraping framework for Go. Supports JS rendering.
Twitterscraper
⭐
1,852
Scrape Twitter for Tweets
Node.io
⭐
1,845
Nyt 2020 Election Scraper
⭐
1,788
Content
⭐
1,711
Official content for Harvard CS109
Scrapely
⭐
1,668
A pure-python HTML screen-scraping library
Scraper
⭐
1,639
HTML parsing and querying with CSS selectors
Upton
⭐
1,615
A batteries-included framework for easy web-scraping. Just add CSS! (Or do more.)
Linkedin_scraper
⭐
1,534
A library that scrapes LinkedIn for user data
Jobfunnel
⭐
1,533
Scrape job websites into a single spreadsheet with no duplicates.
Scrape
⭐
1,464
A simple, higher level interface for Go web scraping.
Tomorrow
⭐
1,463
Magic decorator syntax for asynchronous code in Python
Node Website Scraper
⭐
1,456
Download website to local directory (including all css, images, js, etc.)
Rvest
⭐
1,434
Simple web scraping for R
Snmp_exporter
⭐
1,433
SNMP Exporter for Prometheus
How To Prevent Scraping
⭐
1,417
The ultimate guide on preventing Website Scraping
Recipe Scrapers
⭐
1,408
Python package for scraping recipes data
Article Extractor
⭐
1,297
To extract main article from given URL with Node.js
Wombat
⭐
1,297
Lightweight Ruby web crawler/scraper with an elegant DSL which extracts structured data from pages.
Jd Autobuy
⭐
1,291
Python crawler for JD.com: automatic login and online flash-purchasing of products
Shot Scraper
⭐
1,285
A command-line utility for taking automated screenshots of websites
Cloudproxy
⭐
1,235
Hide your scraper's IP behind the cloud. Provision proxy servers across different cloud providers to improve your scraping success.
Cariddi
⭐
1,228
Take a list of domains, crawl urls and scan for endpoints, secrets, api keys, file extensions, tokens and more
Gdom
⭐
1,180
DOM Traversing and Scraping using GraphQL
Cinemagoer
⭐
1,156
Cinemagoer is a Python package for retrieving and managing data from the IMDb movie database (with which we are not affiliated in any way) about movies, people, characters, and companies
Informer
⭐
1,141
A Telegram Mass Surveillance Bot in Python
Scrapy Cluster
⭐
1,137
This Scrapy project uses Redis and Kafka to create a distributed on demand scraping cluster.
Event Collect
⭐
1,110
Scraper and converter from event website listings to the Open Event format
Animdl
⭐
1,105
A highly efficient, fast, powerful and light-weight anime downloader and streamer for your favorite anime.
Fansly Downloader
⭐
1,103
Easy to use fansly.com content downloading tool. Written in python, but ships as a standalone Executable App for Windows too. Enjoy your Fansly content offline anytime, anywhere in the highest possible content resolution! Fully customizable to download in bulk or single: photos, videos & audio from timeline, messages, collection & specific posts 👍
Loklak_scraper_js
⭐
1,094
Scrapers for loklak in javascript
Newpipeextractor
⭐
1,070
NewPipe's core library for extracting data from streaming sites
Django Dynamic Scraper
⭐
1,069
Creating Scrapy scrapers via the Django admin interface
Scanless
⭐
1,061
online port scan scraper
Parliament Scraper
⭐
1,049
Public Data Scraper for Parliament Data for the EU and other Parliaments
Crawler User Agents
⭐
1,045
Syntactic patterns of HTTP user-agents used by bots / robots / crawlers / scrapers / spiders. pull-request welcome ⭐
Redditdownloader
⭐
1,045
Scrapes Reddit to download media of your choice.
Osi.ig
⭐
1,027
Information Gathering Instagram.
Artoo
⭐
1,024
artoo.js - the client-side scraping companion.
Parsel
⭐
1,010
Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors
Pjscrape
⭐
1,003
A web-scraping framework written in Javascript, using PhantomJS and jQuery
Mangal
⭐
981
📖 The most advanced (yet simple) cli manga downloader in the entire universe! Lua scrapers, export formats, anilist integration, fancy TUI and more!
Related Searches
Python Scraper (5,696)
Javascript Scraper (2,047)
Scraper Scrape (1,534)
Scraper Web Crawler (1,528)
Scraper Crawler (904)
Html Scraper (757)
1-100 of 4,239 search results
Copyright 2018-2024 Awesome Open Source. All rights reserved.