Webarchive Indexing

Tools for bulk indexing of WARC/ARC files on Hadoop, EMR, or a local file system.
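As a rough illustration of what indexing a WARC file involves, the sketch below records the byte offset, length, and target URI of each record in an uncompressed WARC stream. It is not this project's implementation. The function name `index_warc` and the helper `make_record` are hypothetical, and the sketch assumes an uncompressed WARC in which every record carries a `Content-Length` header. Real archives are usually gzip-compressed per record, which this sketch does not handle.

```python
import io


def index_warc(stream):
    """Yield (offset, length, target_uri) for each record in an
    uncompressed WARC stream opened in binary mode.

    Hypothetical helper for illustration only.
    """
    while True:
        offset = stream.tell()
        line = stream.readline()
        if not line:
            break  # end of stream
        if not line.strip():
            continue  # blank separator lines between records
        if not line.startswith(b"WARC/"):
            raise ValueError("expected a WARC version line at offset %d" % offset)

        # Read the named header fields up to the blank line that ends them.
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break
            name, _, value = line.partition(b":")
            headers[name.strip().lower()] = value.strip()

        # Skip the record body without reading it into memory.
        content_length = int(headers[b"content-length"])
        stream.seek(content_length, io.SEEK_CUR)

        # The length covers headers and body, not the trailing separator.
        length = stream.tell() - offset
        uri = headers.get(b"warc-target-uri", b"").decode("utf-8")
        yield offset, length, uri


def make_record(uri, body):
    """Build a minimal WARC record as bytes (sample data for the demo)."""
    return (
        b"WARC/1.0\r\n"
        b"WARC-Type: resource\r\n"
        b"WARC-Target-URI: " + uri + b"\r\n"
        b"Content-Length: " + str(len(body)).encode("ascii") + b"\r\n"
        b"\r\n" + body + b"\r\n\r\n"
    )


sample = make_record(b"http://example.com/", b"hello") + make_record(
    b"http://example.org/page", b"second body"
)
for entry in index_warc(io.BytesIO(sample)):
    print(entry)
```

An index of this kind lets a reader seek straight to a record instead of scanning the whole archive, which is why bulk indexing is worth distributing across a cluster for large collections.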
Alternatives To Webarchive Indexing
| Project Name | Stars | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language | Description |
|---|---|---|---|---|---|---|---|---|
| Opensoc Streaming | 123 | 4 years ago | | | 8 | apache-2.0 | Java | Extensible set of Storm topologies and topology attributes for streaming, enriching, indexing, and storing telemetry in Hadoop. |
| Druid Spark Batch | 89 | 5 years ago | | | 21 | apache-2.0 | Scala | Druid indexing plugin for using Spark in batch jobs |
| Webarchive Indexing | 30 | 6 years ago | | | 5 | mit | Python | Tools for bulk indexing of WARC/ARC files on Hadoop, EMR, or a local file system. |
| Elephant Twin Lzo | 14 | 12 years ago | | | 1 | gpl-3.0 | Java | Elephant Twin LZO uses Elephant Twin to create LZO block indexes |
| St Hadoop | 13 | 5 years ago | | | | other | Java | ST-Hadoop is an open-source MapReduce extension of Hadoop designed specially to analyze your spatio-temporal data efficiently |
| Hiddenattributemodels | 13 | 3 years ago | | | 2 | | Python | HAM |
| Dann Core | 10 | 9 years ago | | | | | Java | |
| Hivesp | 8 | 10 years ago | | | 1 | apache-2.0 | Java | Hive with Spatial Query Support |
| Python Lzo Indexer | 7 | 10 years ago | 1 | August 06, 2014 | | apache-2.0 | Python | Python library for indexing block offsets within LZO compressed files |
| Chatnoir2 Indexer | 6 | 2 years ago | | | | mit | Java | ChatNoir Indexer |
Python
Hadoop
Arc
Indexing
Mapreduce