Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Readability | 6,963 | 75 | 3 months ago | 6 | March 31, 2023 | 220 | other | JavaScript | ||
A standalone version of the readability lib | ||||||||||
Skrape.it | 714 | 3 | 4 months ago | 14 | July 19, 2022 | 25 | mit | Kotlin | ||
A Kotlin-based testing/scraping/parsing library providing the ability to analyze and extract data from HTML (server & client-side rendered). It places particular emphasis on ease of use and a high level of readability by providing an intuitive DSL. It aims to be a testing lib, but can also be used to scrape websites in a convenient fashion. | ||||||||||
Crux | 220 | | | | 9 months ago | 35 | July 16, 2023 | 10 | apache-2.0 | Kotlin |
Crux offers a flexible plugin-based API & implementation to extract interesting information from Web pages. | ||||||||||
Calque | 66 | 2 | 4 years ago | 13 | March 09, 2019 | 3 | mit | JavaScript | ||
📑 Bringing the power and functionality of JavaScript, but with the readability of HTML. | ||||||||||
Go Domdistiller | 38 | | | | a year ago | 2 | November 02, 2020 | | mit | Go |
Go-DomDistiller is a Go port of the DOM Distiller library which implements Reader mode in Chrome for Android and Desktop. It has no dependencies on Chromium and is meant to run as a command line program or on a server. | ||||||||||
Webmine | 25 | | | | 13 years ago | | | | | Clojure |
Web Mining Toolkit for Clojure | ||||||||||
Dom Snapshot | 14 | 2 | 3 years ago | 23 | March 09, 2021 | mit | JavaScript | |||
Get `<canvas>` from DOM string through SVG `<foreignObject>`. | ||||||||||
Python Readable | 12 | | | | 13 years ago | | | | apache-2.0 | JavaScript |
Python port of Arc90's Readability content extraction rules | ||||||||||
Seize | 9 | 1 | 7 years ago | 8 | June 06, 2016 | HTML | ||||
Seize is a light Node or browser web-page content extractor inspired by Arc90's Readability and Safari Reader.