Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Devops Python Tools | 709 | | | | 4 months ago | | | 37 | mit | Python |
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Functions, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc. | ||||||||||
Divolte Collector | 275 | | | | 3 years ago | | | 63 | apache-2.0 | Java |
Divolte Collector | ||||||||||
Storagetapper | 269 | | | | 2 years ago | 4 | November 19, 2021 | 21 | mit | Go |
StorageTapper is a scalable realtime MySQL change data streaming, logical backup and logical replication service | ||||||||||
Bigdata File Viewer | 269 | | | | 7 months ago | | | 2 | gpl-2.0 | Java |
A cross-platform (Windows, Mac, Linux) desktop application to view common big data binary formats like Parquet, ORC, AVRO, etc. Supports local file system, HDFS, AWS S3, Azure Blob Storage, etc. | ||||||||||
Hdfs | 257 | 7 | 7 months ago | 15 | December 13, 2022 | 20 | mit | Python | ||
API and command line interface for HDFS | ||||||||||
Rumble | 194 | | | | a year ago | 4 | December 03, 2019 | 134 | other | Java |
⛈️ RumbleDB 1.21.0 "Hawthorn blossom" 🌳 for Apache Spark \| Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) \| No install required (just a jar to download) \| Declarative Machine Learning and more | ||||||||||
Camus | 87 | | | | a year ago | | | 6 | apache-2.0 | Java |
Mirror of Linkedin's Camus | ||||||||||
Spark Compaction | 52 | | | | 5 years ago | | | 3 | apache-2.0 | Java |
File compaction tool that runs on top of the Spark framework. | ||||||||||
Etl Light | 38 | | | | 7 years ago | | | | mit | Scala |
A light Kafka to HDFS/S3 ETL library based on Apache Spark | ||||||||||
Arvo2parquet | 30 | | | | 5 years ago | | | 2 | mit | Java |
Example program that writes Parquet formatted data to plain files (i.e., not Hadoop HDFS); Parquet is a columnar storage format. |