Awesome Open Source


Search results for scrape

1,397 search results found

You Get ⭐ 48,778

⏬ Dumb downloader that scrapes the web

Twint ⭐ 15,469

An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape a user's followers, following, Tweets and more while evading most API limitations.

Portia ⭐ 8,982

Visual scraping for Scrapy

Autoscraper ⭐ 5,159

A Smart, Automatic, Fast and Lightweight Web Scraper for Python

Scraperjs ⭐ 3,575

A complete and versatile web scraper.

Tiktok Scraper ⭐ 3,554

TikTok Scraper. Download video posts, collect user/trend/hashtag/music feed metadata, sign URLs, etc.

Cloudflare Scrape ⭐ 3,229

A Python module to bypass Cloudflare's anti-bot page.

Pushgateway ⭐ 2,795

Push acceptor for ephemeral and batch jobs.

Metascraper ⭐ 2,189

Get unified metadata from websites using Open Graph, Microdata, RDFa, Twitter Cards, JSON-LD, HTML, and more.

Facebook Page Post Scraper ⭐ 2,014

Data scraper for Facebook Pages, and also code accompanying the blog post How to Scrape Data From Facebook Page Posts for Statistical Analysis

Facebook Scraper ⭐ 1,936

Scrape Facebook public pages without an API key

Twitterscraper ⭐ 1,852

Scrape Twitter for Tweets

Scrapely ⭐ 1,668

A pure-python HTML screen-scraping library

Upton ⭐ 1,615

A batteries-included framework for easy web-scraping. Just add CSS! (Or do more.)

Email2phonenumber ⭐ 1,601

An OSINT tool to obtain a target's phone number just by having their email address

Linkedin_scraper ⭐ 1,534

A library that scrapes Linkedin for user data

Jobfunnel ⭐ 1,533

Scrape job websites into a single spreadsheet with no duplicates.

Scrape ⭐ 1,464

A simple, higher level interface for Go web scraping.

Styleguide ⭐ 1,461

How To Prevent Scraping ⭐ 1,417

The ultimate guide on preventing Website Scraping

Nginx Prometheus Exporter ⭐ 1,417

NGINX Prometheus Exporter for NGINX and NGINX Plus

Prometheus Basics ⭐ 1,413

A beginner friendly introduction to prometheus 🔥

Twitter Api Client ⭐ 1,149

Implementation of X/Twitter v1, v2, and GraphQL APIs

Ansible Prometheus ⭐ 1,101

Deploy Prometheus monitoring system

Loklak_scraper_js ⭐ 1,094

Scrapers for loklak in javascript

Redditdownloader ⭐ 1,045

Scrapes Reddit to download media of your choice.

Pjscrape ⭐ 1,003

A web-scraping framework written in Javascript, using PhantomJS and jQuery

Onionsearch ⭐ 989

OnionSearch is a script that scrapes urls on different .onion search engines.

Mlscraper ⭐ 935

🤖 Scrape data from HTML websites automatically by just providing examples

Twitter Scraper ⭐ 769

Scrape the Twitter frontend API without authentication with Golang.

A simple and unlimited twitter scraper : scrape tweets, likes, retweets, following, followers, user info, images...

Skrape.it ⭐ 714

A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.

Scrapedin ⭐ 712

A tool to scrape LinkedIn without API restrictions for data reconnaissance

Robintrack ⭐ 679

Scrapes the Robinhood API to retrieve + store popularity and price data.

Python_and_the_web ⭐ 662

Build Bots, Scrape a website or use an API to solve a problem.

Voyager ⭐ 618

Crawl and scrape web pages in Rust

Javmoviescraper ⭐ 618

Scrape XBMC and Kodi movie metadata and automatically rename files for Japanese Adult Videos (JAV), American Adult DVDs, and American Adult web content

Universal Reddit Scraper - A comprehensive Reddit scraping command-line tool.

Haproxy_exporter ⭐ 599

Simple server that scrapes HAProxy stats and exports them via HTTP for Prometheus consumption

Imagescraper ⭐ 572

✂️ High performance, multi-threaded image scraper

Json_exporter ⭐ 567

A prometheus exporter which scrapes remote JSON by JSONPath

Automated_youtube_channel ⭐ 564

Automated YouTube channel that can scrape content, edit a compilation, and upload to YouTube daily.

Audiobooks.bundle ⭐ 554

Plex metadata scraper for Audiobooks

Instascrape ⭐ 554

Powerful and flexible Instagram scraping library for Python, providing easy-to-use and expressive tools for accessing data programmatically

Communityscrapers ⭐ 545

This is a public repository containing scrapers created by the Stash Community.

Scrapli ⭐ 537

Fast, flexible, sync/async, Python 3.7+ screen scraping client specifically for network devices

Linkedin Scraper ⭐ 512

Scrapes the public profile of a LinkedIn page

2020ncov_individual_archives ⭐ 510

"Every individual matters. Every individual has a role to play." This is a repository that archives the individual stories during the COVID19 pandemic. Backing up ordinary people's records from the pandemic. (Translation and updates ongoing.)

Se Scraper ⭐ 477

Javascript scraping module based on puppeteer for many different search engines...

Openwebtext ⭐ 463

Open clone of OpenAI's unreleased WebText dataset scraper. This version uses pushshift.io files instead of the API for speed.

Amazon Product Api ⭐ 457

Amazon Scraper. Scrape products from Amazon search results or reviews for a specific product

Oracledb_exporter ⭐ 432

Prometheus Oracle database exporter.

Node Google ⭐ 420

A Node.js module to search and scrape Google.

Scrape domain names from SSL certificates of arbitrary hosts

Grammers ⭐ 397

(tele)gramme.rs - use Telegram's API from Rust

Instamancer ⭐ 380

Scrape Instagram's API with Puppeteer

Ceph_exporter ⭐ 379

Prometheus exporter that scrapes meta information about a ceph cluster.

Instagram Scraper ⭐ 367

Instagram Scraper. Scrape useful data and posts from Instagram user, hashtag, and location pages, plus comments and the people who liked specific posts, with more coming soon. No login or API keys are required

Senator Filings ⭐ 366

Scrape public filings of the buy + sell orders of U.S. senators and calculate their returns

Micro Open Graph ⭐ 362

A tiny Node.js microservice to scrape open graph data with joy.

Scrape Linkedin Selenium ⭐ 353

`scrape_linkedin` is a python package that allows you to scrape personal LinkedIn profiles & company pages - turning the data into structured json.

Hackernews ⭐ 341

There are way too many stories on Hacker News, and there's no option for "show me only the stories that Joel would like". So I built one. (Maybe "cobbled together" is more appropriate.)

Prom2json ⭐ 332

A tool to scrape a Prometheus client and dump the result as JSON.

City Scrapers ⭐ 315

Scrape, standardize and share public meetings from local government websites

Advanced Python library to scrape Twitter (tweets, users) from unofficial API

Node Readability ⭐ 302

Scrape or crawl articles from any site automatically. Make any web page readable, whether Chinese or English.

Elixir Scrape ⭐ 300

Scrape any website, article or RSS/Atom Feed with ease!

Python web scraping framework

Recreation Gov Campsite Checker ⭐ 294

Scrapes the recreation.gov website to check for campsite availabilities 🏕🏕

Apache_exporter ⭐ 291

Prometheus exporter for Apache.

Webinspector ⭐ 286

Ruby gem to completely inspect a web page. It scrapes a given URL and returns its meta, links, images, and more.

Pupflare ⭐ 267

A webpage proxy that requests through Chromium (puppeteer) - can be used to bypass Cloudflare anti bot / anti ddos on any application (like curl)

Python Automation Scripts ⭐ 264

Simple yet powerful automation scripts.

Proxy Scraper ⭐ 260

Scrape proxies from more than 5 different sources and check which ones are still alive

Scraper ⭐ 254

Simple web scraping for Google Chrome.

Yahoo_fin ⭐ 251

Scrape stock price history from new (Spring 2017) Yahoo Finance layout

Openparliament ⭐ 246

Keeping tabs on Canada's Parliament

Mwoffliner ⭐ 245

Mediawiki scraper: all your wiki articles in one highly compressed ZIM file

Batocera Emulationstation ⭐ 243

Prometheus Ecs Discovery ⭐ 243

A Prometheus discoverer that scrapes Amazon ECS and generates a file SD configuration file.

Stackdriver_exporter ⭐ 240

Google Stackdriver Prometheus exporter

Rcrawler ⭐ 240

An R web crawler and scraper

Spatula ⭐ 233

A modern Python library for writing maintainable web scrapers.

Google Images Download ⭐ 231

Google/Bing Images Web Downloader

Nudecrawler ⭐ 231

Crawl telegra.ph searching for nudes!

Youtube Comment Scraper ⭐ 229

A web client that scrapes YouTube comments

Proxyscrape ⭐ 224

Python library for retrieving free proxies (HTTP, HTTPS, SOCKS4, SOCKS5).

Sota Extractor ⭐ 221

The SOTA extractor pipeline

A declarative struct-tag-based HTML unmarshaling or scraping package for Go built on top of the goquery library

Rightmove_webscraper.py ⭐ 219

Python class to scrape data from rightmove.co.uk and return listings in a pandas DataFrame object

Wayback Machine Scraper ⭐ 219

A command-line utility and Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.

Amazon Scraper ⭐ 219

A simple web scraper to extract Product Data and Pricing from Amazon

Images Scraper ⭐ 218

Simple and fast scraper for Google

Skraper ⭐ 217

Kotlin/Java library and cli tool for scraping posts and media from various sources with neither authorization nor full page rendering (Facebook, Instagram, Twitter, Youtube, Tiktok, Telegram, Twitch, Reddit, 9GAG, Pinterest, Flickr, Tumblr, Coub, Vimeo, IFunny, VK, Odnoklassniki, Pikabu)

Transistor ⭐ 209

Transistor, a Python web scraping framework for intelligent use cases.

Ha Multiscrape ⭐ 198

Home Assistant custom component for scraping (html, xml or json) multiple values (from a single HTTP request) with a separate sensor/attribute for each value. Support for (login) form-submit functionality.

Gitscraper ⭐ 197

A tool which scrapes public github repositories for common naming conventions in variables, folders and files

Scrape a website efficiently, block by block, page by page. Based on cheerio and curl.

Humanoid ⭐ 191

Node.js package to bypass CloudFlare's anti-bot JavaScript challenges


1-100 of 1,397 search results


Copyright 2018-2024 Awesome Open Source. All rights reserved.