Warc Hadoop

WARC (Web Archive) Input and Output Formats for Hadoop
Alternatives To Warc Hadoop
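Warcbase
Stars: 154; most recent commit: 7 years ago; open issues: 38; language: Java
Warcbase is an open-source platform for managing and analyzing web archives.

Aut
Stars: 128; most recent commit: a year ago; total releases: 27; latest release: November 17, 2022; open issues: 3; license: Apache-2.0; language: Scala
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.

Bifrost
Stars: 98; most recent commit: 5 years ago; open issues: 4; license: EPL-1.0; language: Clojure
Safely archive data from Apache Kafka to S3 with no Hadoop dependencies :)

Warc Hadoop
Stars: 31; downloads, repos using this, packages using this: 231 (digits fused in the source); most recent commit: 10 years ago; total releases: 1; latest release: May 10, 2014; open issues: 4; license: MIT; language: Java
WARC (Web Archive) Input and Output Formats for Hadoop

Graylog Plugin Output Webhdfs
Stars: 11; most recent commit: 7 years ago; license: MIT; language: Java
WebHDFS Output plugin for Graylog

Ukwa Manage
Stars: 10; most recent commit: 8 months ago; open issues: 54; license: Apache-2.0; language: Jupyter Notebook
Shepherding our web archives from crawl to access.

Hawarp
Stars: 7; most recent commit: 8 years ago; open issues: 1; license: Apache-2.0; language: Arc
HAdoop-based Web Archive Record Processing

Archive
Stars: 7; most recent commit: 8 years ago; license: Apache-2.0; language: Java
An archive app based on CDH, providing upload and retrieval REST API

Tarfilesystem
Stars: 6; most recent commit: 7 years ago; open issues: 5; license: Apache-2.0; language: Java
The Tar FileSystem for Hadoop lives here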



Java
Http
Archive
Hadoop