Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Tiledb | 1,700 | 6 | 3 months ago | 87 | November 05, 2022 | 133 | mit | C++ | ||
The Universal Storage Engine | ||||||||||
Astro Sdk | 303 | 2 | 3 months ago | 49 | August 30, 2023 | 153 | apache-2.0 | Python | ||
Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow. | ||||||||||
Pysparkling | 253 | 7 | 1 | a year ago | 69 | November 13, 2022 | 9 | other | Python | |
A pure Python implementation of Apache Spark's RDD and DStream interfaces. | ||||||||||
Rumble | 194 | a year ago | 4 | December 03, 2019 | 134 | other | Java | |||
⛈️ RumbleDB 1.21.0 "Hawthorn blossom" 🌳 for Apache Spark. Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...). No install required (just a jar to download). Declarative Machine Learning and more. | ||||||||||
Sci Pype | 96 | 4 years ago | 3 | apache-2.0 | Python | |||||
A Machine Learning API with native Redis caching and export/import using S3. Analyze entire datasets using an API for building, training, testing, analyzing, extracting, importing, and archiving. This repository can run from a Docker container or from the repository. | ||||||||||
Decorators4ds | 27 | a year ago | mit | Python | ||||||
Useful decorators every Data Scientist should know | ||||||||||
Nodestream | 23 | 3 months ago | 25 | apache-2.0 | Python | |||||
A Fast, Declarative, and Extensible ETL Framework for Graph Databases. | ||||||||||
Red Panda | 18 | 2 years ago | mit | Python | ||||||
Easily interact with cloud (AWS) in your Data Science workflow. | ||||||||||
Tessellate | 5 | 5 months ago | 5 | other | Java | |||||
A data engineering CLI for reading and writing data to/from multiple locations across multiple formats. |