Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Data Science Ipython Notebooks | 25,668 | | | | 6 months ago | | | 34 | other | Python |
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines. | ||||||||||
Vaquarkhan | 1,464 | | | | a year ago | | | | | |
Matano | 1,259 | | | | 5 months ago | | | 53 | apache-2.0 | Rust |
Open source security data lake for threat hunting, detection & response, and cybersecurity analytics at petabyte scale on AWS | ||||||||||
Hudi Resources | 509 | | | | 3 months ago | | | | | |
A collection of Apache Hudi resources | ||||||||||
Arvados | 354 | | | | 3 months ago | 9 | September 21, 2023 | 13 | other | Go |
An open source platform for managing and analyzing biomedical big data | ||||||||||
Cloudbreak | 348 | | | | 3 months ago | | | 41 | apache-2.0 | Java |
CDP Public Cloud is an integrated analytics and data management platform deployed on cloud services. It offers broad data analytics and artificial intelligence functionality along with secure user access and data governance features. | ||||||||||
Bigdata File Viewer | 269 | | | | 6 months ago | | | 2 | gpl-2.0 | Java |
A cross-platform (Windows, Mac, Linux) desktop application for viewing common big data binary formats such as Parquet, ORC, AVRO, etc. Supports the local file system, HDFS, AWS S3, Azure Blob Storage, etc. | ||||||||||
Parquet4s | 267 | 6 | 3 months ago | 57 | November 12, 2023 | 6 | mit | Scala | ||
Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster. | ||||||||||
Amazon S3 Find And Forget | 223 | | | | 3 months ago | | | 13 | apache-2.0 | Python |
Amazon S3 Find and Forget is a solution to handle data erasure requests from data lakes stored on Amazon S3, for example, pursuant to the European General Data Protection Regulation (GDPR) | ||||||||||
Aws Etl Orchestrator | 185 | | | | 4 years ago | | | 1 | other | Python |
A serverless architecture for orchestrating ETL jobs in arbitrarily-complex workflows using AWS Step Functions and AWS Lambda. |