Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Spark | 37,661 | | 2,394 | 939 | 5 months ago | 46 | May 09, 2021 | 186 | apache-2.0 | Scala |
Apache Spark - A unified analytics engine for large-scale data processing | ||||||||||
Data Science Ipython Notebooks | 25,668 | | | | 9 months ago | | | 34 | other | Python |
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines. | ||||||||||
Bigdata Notes | 14,872 | | | | 6 months ago | | | 39 | | Java |
A beginner's guide to big data :star: | ||||||||||
Deeplearning4j | 13,483 | | 175 | 119 | 15 days ago | 54 | August 10, 2022 | 624 | apache-2.0 | Java |
Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and onnx/pytorch, a modular and tiny c++ library for running math code and a java based math library on top of the core c++ library. Also includes samediff: a pytorch/tensorflow like library for running deep learning using automatic differentiation. | ||||||||||
Cookbook | 12,557 | | | | 6 months ago | | | 111 | apache-2.0 | |
The Data Engineering Cookbook | ||||||||||
Doris | 11,243 | | | | 3 months ago | 8 | September 27, 2023 | 2,332 | apache-2.0 | Java |
Apache Doris is an easy-to-use, high performance and unified analytics database. | ||||||||||
It_book | 8,543 | | | | 3 years ago | | | 7 | | |
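This project collects more than a thousand good, commonly used books that the author has read or heard about over the years; the book you are looking for may well be here. It covers most books relevant to the internet industry, along with interview experience and questions. Series include artificial intelligence (common deep learning frameworks TensorFlow, pytorch, keras; NLP, machine learning, deep learning, etc.), big data (Spark, Hadoop, Scala, kafka, etc.), and programmer essentials (C, C++, java, data structures, linux, design patterns, databases, etc.) | ||||||||||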
God Of Bigdata | 8,483 | | | | a year ago | | | 3 | | |
Focused on big data learning and interview preparation; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive... | ||||||||||
H2o 3 | 6,618 | | 62 | 33 | 5 months ago | 49 | August 09, 2023 | 2,746 | apache-2.0 | Jupyter Notebook |
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc. | ||||||||||
Alluxio | 6,612 | | 31 | 53 | 3 months ago | 73 | November 29, 2023 | 969 | apache-2.0 | Java |
Alluxio, data orchestration for analytics and machine learning in the cloud |