Learning Hadoop And Spark Alternatives

Companion to the Learning Hadoop and Learning Spark courses on LinkedIn Learning

Stars: 160
License: apache-2.0
Most Recent Commit: over 2 years ago
Programming Language: HTML

Categories

Web User Interface > Html

Cloud Computing > Amazon Web Services

Cloud Computing > Azure

Data Processing > Spark

Cloud Computing > Google Cloud Platform

Data Processing > Hadoop

Data Processing > Mapreduce

Data Processing > Apache Spark

Alternatives To lynnlangit/learning-hadoop-and-spark

Project Name	Stars	Repos Using This	Packages Using This	Most Recent Commit	Total Releases	Latest Release	Open Issues	License	Language
apache/spark	37,661	2,394	939	over 2 years ago	46	May 09, 2021	186	apache-2.0	Scala
Apache Spark - A unified analytics engine for large-scale data processing
donnemartin/data-science-ipython-notebooks	25,668	0	0	almost 3 years ago	0		34	other	Python
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
heibaiying/BigData-Notes	14,872	0	0	over 2 years ago	0		39		Java
A beginner's guide to big data :star:
deeplearning4j/deeplearning4j	14,235	175	119	about 1 month ago	54	August 10, 2022	624	apache-2.0	Java
Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and onnx/pytorch, a modular and tiny c++ library for running math code and a java based math library on top of the core c++ library. Also includes samediff: a pytorch/tensorflow like library for running deep learn...
andkret/Cookbook	12,557	0	0	over 2 years ago	0		111	apache-2.0
The Data Engineering Cookbook
apache/doris	10,666	0	0	over 2 years ago	8	September 27, 2023	2,332	apache-2.0	Java
Apache Doris is an easy-to-use, high performance and unified analytics database.
XiangLinPro/IT_book	8,543	0	0	over 4 years ago	0		7
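This project collects over a thousand good, commonly used books that the author has read or heard about over the years; the book you are looking for may well be here. It covers most books for the internet industry, interview experience and questions, and more. It includes an artificial intelligence series (common deep learning frameworks TensorFlow, pytorch, keras; NLP, machine learning, deep learning, etc.), a big data series (Spark, Hadoop, Scala, kafka, etc.), and a programmer essentials series (C, C++, java, data structures, linux, design patterns, databases, etc.).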
wangzhiwubigdata/God-Of-BigData	8,483	0	0	almost 3 years ago	0		3
Focused on big data learning and interviews; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...
h2oai/h2o-3	7,485	62	33	3 months ago	49	August 09, 2023	2,746	apache-2.0	Jupyter Notebook
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Alluxio/alluxio	6,544	31	53	over 2 years ago	73	November 29, 2023	969	apache-2.0	Java
Alluxio, data orchestration for analytics and machine learning in the cloud

Popular Spark Projects

getredash/redash ⭐ 24,479

Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.

yeasy/docker_practice ⭐ 23,279

Learn and understand Docker and container technologies, with real DevOps practice!

DataTalksClub/data-engineering-zoomcamp ⭐ 19,461

Free Data Engineering course!

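zhisheng17/flink-learning ⭐ 13,801

Flink learning blog. http://www.54tianzhisheng.cn/ Covers Flink introduction, concepts, principles, hands-on practice, performance tuning, source code analysis, and more. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, Table API & SQL, plus case studies of large production Flink projects (PV/UV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). Support for the author's column "Big Data Real-Time Computing Engine: Flink in Practice and Performance Optimization" is welcome.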

horovod/horovod ⭐ 13,755

Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

Popular Hadoop Projects

dmlc/xgboost ⭐ 25,253

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

spotify/luigi ⭐ 17,046

Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, etc. It also comes with Hadoop support built in.

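Tencent/APIJSON ⭐ 16,277

🏆 A zero-code, full-featured, highly secure ORM library 🚀 Backend APIs and docs with zero code; the frontend (client) customizes the data and structure of the returned JSON. 🏆 A JSON Transmission Protocol and an ORM Library 🚀 provides APIs and Docs without writing any code.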

trinodb/trino ⭐ 9,118

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

linkedin/school-of-sre ⭐ 8,103

At LinkedIn, we are using this curriculum for onboarding our entry-level talents into the SRE role.

Popular Data Processing Categories

Jupyter Notebook

Dataset

Sql

Validation

Pipeline

Translation

Data Science

Classification

Transaction

Scraper