Awesome Open Source

Search results for html scraper

142 search results found

Cheerio ⭐ 27,702

The fast, flexible, and elegant library for parsing and manipulating HTML and XML.

Easyspider ⭐ 20,149

A visual no-code/code-free web crawler/spider. 易采集 (EasySpider): visual software for browser automation testing, data collection, and web crawling, usable graphically without code

Requests Html ⭐ 13,100

Pythonic HTML Parsing for Humans™

Jsoup ⭐ 10,463

jsoup: the Java HTML parser, built for HTML editing, cleaning, scraping, and XSS safety.

Aos Avp ⭐ 2,515

NOVA opeN sOurce Video plAyer: main repository to build them all

Nyt 2020 Election Scraper ⭐ 1,788

Scrapely ⭐ 1,668

A pure-python HTML screen-scraping library

Scraper ⭐ 1,639

HTML parsing and querying with CSS selectors

Upton ⭐ 1,615

A batteries-included framework for easy web-scraping. Just add CSS! (Or do more.)

Rvest ⭐ 1,434

Simple web scraping for R

How To Prevent Scraping ⭐ 1,417

The ultimate guide on preventing Website Scraping

Parsel ⭐ 1,010

Parsel lets you extract data from XML/HTML documents using XPath or CSS selectors

Mlscraper ⭐ 935

🤖 Scrape data from HTML websites automatically by just providing examples

Website Downloader ⭐ 895

💡 Download the complete source code of any website (including all assets). [ Javascripts, Stylesheets, Images ] using Node.js

Gazpacho ⭐ 716

🥫 The simple, fast, and modern web scraping library

Skrape.it ⭐ 714

A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion.

Scala Scraper ⭐ 710

A Scala library for scraping content from HTML pages

Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.

Jekyll-based static site for The Programming Historian

Se Scraper ⭐ 477

Javascript scraping module based on puppeteer for many different search engines...

Pywebcopy ⭐ 455

Locally saves webpages to your hard disk with images, css, js & links as is.

Opensanctions ⭐ 427

An open database of international sanctions data, persons of interest and politically exposed persons

Ultimate Web Scraper ⭐ 400

A PHP library/toolkit designed to handle all of your web scraping needs under an MIT or LGPL license. Also has web server and WebSocket server classes for building custom servers.

Basketball_reference_web_scraper ⭐ 382

NBA Stats API via Basketball Reference

Lambdasoup ⭐ 375

Functional HTML scraping and rewriting with CSS in OCaml

Scrape Linkedin Selenium ⭐ 353

`scrape_linkedin` is a python package that allows you to scrape personal LinkedIn profiles & company pages - turning the data into structured json.

Hquery.php ⭐ 345

An extremely fast web scraper that parses megabytes of invalid HTML in a blink of an eye. PHP5.3+, no dependencies.

Juriscraper ⭐ 314

An API to scrape American court websites for metadata.

Scrapysharp ⭐ 307

A revival of https://bitbucket.org/rflechner/scrapysharp

Elixir Scrape ⭐ 300

Scrape any website, article or RSS/Atom Feed with ease!

Leetcode ⭐ 275

Leetcode Questions - Sorted by likes, likes-dislikes ratio and much more

Youtube Projects ⭐ 272

This repository contains all the code I use in my YouTube tutorials.

Torrent Search Api ⭐ 258

Yet another node torrent scraper (supports iptorrents, torrentleech, torrent9, torrentz2, 1337x, thepiratebay, Yggtorrent, TorrentProject, Eztv, Yts, LimeTorrents)

Tagsoup ⭐ 229

Haskell library for parsing and extracting information from (possibly malformed) HTML/XML documents

Requests Html ⭐ 207

Pythonic HTML Parsing for Humans™

Scrape a website efficiently, block by block, page by page. Based on cheerio and curl.

Daath Ai Parser ⭐ 184

Daath AI Parser is an open-source application that uses OpenAI to parse visible text of HTML elements.

Scheduler of spiders for scraping and parsing HTML and JSON pages

Unhtml.rs ⭐ 173

A magic html parser

Extract data or evaluate value from HTML/XML documents using XPath

LOOKING FOR A MAINTAINER

Web scraping made simple.

Nibbler ⭐ 142

A cute HTML scraper / data extraction tool in under 70 lines of code

Htmlsql ⭐ 121

htmlSQL is an experimental PHP library that lets you access HTML values with an SQL-like syntax.

Go Latest ⭐ 120

Simple way to check version is latest or not from various sources in Golang

Html Metadata ⭐ 115

MetaData html scraper and parser for Node.js (supports Promises and callback style)

Nimquery ⭐ 111

Nim library for querying HTML using CSS selectors (like JavaScript's document.querySelector)

Python Web Scraping Cookbook ⭐ 107

Python Web Scraping Cookbook, published by Packt

Html2rss ⭐ 106

📰 Build RSS 2.0 feeds from websites (and JSON APIs) with a few CSS selectors.

Web Scraper ⭐ 97

Perl web scraping toolkit

Daily Scraper ⭐ 93

Fetches information about every webpage 🤖

AnimeEZ: a free anime streaming website without any ads (demo: https://animeez.live). By the way, it's made in HTML.

Openscraper ⭐ 80

An open source webapp for scraping: towards a public service for webscraping

Google Covid19 Mobility Reports ⭐ 79

Data extraction of Google's COVID-19 Mobility Reports

Tatooine ⭐ 78

A powerful scraper for JavaScript Developers.

Intelligent Web Data Extractor

Your preferred open source focused crawler for the deep web.

Meteor Scrape ⭐ 73

Scrape any Website or RSS/Atom-Feed with ease.

Laravel Intelligent Scraper ⭐ 72

Service to scrape a web page easily without knowing its HTML structure.

Autoscrape Py ⭐ 70

An automated, programming-free web scraper for interactive sites

Thinkdiff ⭐ 68

My open source project links, programming and software development related code and tutorials are in this repo. Content types: Python, JavaScript, Dart | Django, React, Flutter, React-Native etc.

Top Github Scraper ⭐ 67

Scrape top GitHub repositories and users based on keywords

Analysis of the most poisoned names in the US

Comicsrss.com ⭐ 66

RSS feeds for comics

Newspaper4k ⭐ 66

📰 Newspaper4k, a fork of the beloved Newspaper3k. Extraction of articles, titles, and metadata from news websites.

A Rust crate for manipulating HTML with CSS selectors

Newspaperjs ⭐ 63

News extraction and scraping. Article Parsing

Stack Scraper ⭐ 61

OCaml functional web scraping library

A web scraping framework for .NET

Checks USD/PYG exchange rate from several sites, with a calculator, RESTful API and a twitter bot

Euro2016_terminalapp ⭐ 55

⚽ Instantly find 🏆EURO 2016 live-streams & highlights, now a Web App!

Selectorlib ⭐ 55

A library to read a YML file with Xpath or CSS Selectors and extract data from HTML pages using them

Scraping Helper Chrome Extension ⭐ 53

Scraping Helper will help you to find out the best html/css selector for certain elements

Newscoverageonwuhan ⭐ 51

Chinese news coverage of Wuhan during the epidemic outbreak

Html Table Extractor ⭐ 51

extract data from html table

Chegg Scraper ⭐ 50

Download Chegg homework-help questions to self-sufficient HTML files

Web Scraping Framework

Domain-specific language for extracting structured data from HTML documents

Simple PHP script that notifies you of free appointments on the Berlin services website.

Scraper Fourone Jobs ⭐ 43

An anti-scraping cracker for extracting application information from one of Taiwan's job recruiting websites.

Yellowpages Scraper ⭐ 43

Yellowpages.com Web Scraper written in Python and LXML to extract business details available based on a particular category and location.

Rscraping Jsm 2016 ⭐ 42

Repository for one-day course "A Primer to Web Scraping with R"

Extract rich metadata from URLs

Searchscraperapi ⭐ 41

Aiohttp web server API, which scrapes Google and returns scrape results as response. Supports proxies, multiple geos and number of results.

Formless ⭐ 40

Completely transparent, unobtrusive form populator for web applications and content scrapers.

Uber_data ⭐ 40

Uber web interface crawler / scraper - Convert the trips table into a CSV file

Ronin Web ⭐ 40

ronin-web is a collection of useful web helper methods and commands.

Tvseries ⭐ 36

TV Series is a tool that scrapes episode synopses of popular TV series from websites like Wikipedia and IMDb and shows them in one place with a user-friendly navigation UI.

Linkebot ⭐ 36

🔎 A web scraping bot for showing LinkedIn job openings

Readability Cli ⭐ 35

A CLI for Mozilla Readability. Get clean, uncluttered, ready-to-read HTML from any webpage!

Docparser ⭐ 34

A ruby web/screen scraping tool / gem.

Scrape Metadata ⭐ 32

📜 HTML metadata scraper

Unofficial client to access your Itaú bank data

Wwwclient ⭐ 31

Advanced web browsing, scraping and automation

Interactive Facebook Reactions ⭐ 30

Jupyter notebook + Code for processing Facebook Reactions data and making Interactive Charts

Xpath Selector ⭐ 28

Library implementing easy XPath queries. Very useful for HTML and XML web scraping.

An R Scraper for Tiktok

Geoip Scraper ⭐ 27

Scrapes specified files, generating a pretty google powered map with geoip results

Kotlin DSL to scrape HTML and convert it to JSON

Copyright 2018-2024 Awesome Open Source. All rights reserved.