Awesome Open Source
Search results for machine learning hadoop
37 search results found
Data Science Ipython Notebooks
⭐
25,668
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Xgboost
⭐
25,253
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on a single machine, Hadoop, Spark, Dask, Flink and DataFlow.
H2o 3
⭐
6,618
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Tensorflowonspark
⭐
3,851
TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.
Xlearning
⭐
1,729
AI on Hadoop
Tony
⭐
697
TonY is a framework to natively run deep learning frameworks on Apache Hadoop.
Dist Keras
⭐
611
Distributed Deep Learning, with a focus on distributed training, using Keras and Apache Spark.
Venice
⭐
402
Venice, Derived Data Platform for Planet-Scale Workloads.
Ytk Learn
⭐
351
Ytk-learn is a distributed machine learning library that implements most popular machine learning algorithms (GBDT, GBRT, Mixture Logistic Regression, Gradient Boosting Soft Tree, Factorization Machines, Field-aware Factorization Machines, Logistic Regression, Softmax).
Cascading
⭐
321
Cascading is a feature-rich API for defining and executing complex, fault-tolerant data processing workflows on various cluster computing platforms. Please see https://github.com/cwensel/cascading for access to all WIP branches.
Sagemaker Spark
⭐
285
A Spark library for Amazon SageMaker.
Shifu
⭐
235
An end-to-end machine learning and data mining framework on Hadoop
Bigdata Playground
⭐
154
A complete example of a big data application using: Kubernetes (kops/AWS), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, Node.js, Angular, GraphQL.
Bigdata
⭐
142
Hadoop, HBase, Storm, Spark, etc.
Xlearning Xdml
⭐
101
Extremely distributed machine learning
Ros_hadoop
⭐
98
Hadoop splittable InputFormat for ROS. Process rosbag files with Hadoop, Spark, and other HDFS-compatible systems.
Spark With Python
⭐
98
Fundamentals of Spark with Python (using PySpark), code examples
Resilient Ml Research Platform
⭐
76
Guagua
⭐
72
An iterative computing framework for both Hadoop MapReduce and Hadoop YARN.
Bigdataanalytics_infoh515
⭐
56
Material for the Big Data Analytics exercise classes - INFOH515 - Big Data : Distributed Data Management and Scalable Analytics - Université Libre de Bruxelles
Mlhadoop
⭐
53
This repository contains machine learning MapReduce code for Hadoop, written from scratch (without using any package or library), e.g. prediction (linear and logistic regression), clustering (K-Means), classification (KNN), etc.
Movie Recommender Demo
⭐
50
This project walks through how you can create recommendations using Apache Spark machine learning. There are a number of Jupyter notebooks that you can run on IBM Data Science Experience, and there is a live demo of a movie recommendation web application you can interact with. The demo also uses IBM Message Hub (Kafka) to push application events to a topic, where they are consumed by a Spark Streaming job running on IBM BigInsights (Hadoop).
Mastering Scala Machine Learning
⭐
32
Mastering-Scala-Machine-Learning
Uba
⭐
22
UEBA Solution for Insider Security. This repo is archived. Thanks!
Strata 2016
⭐
20
This repo is for the ML/GraphX tutorial at Strata 2016.
Mmtf Spark
⭐
19
Methods for the parallel and distributed analysis and mining of the Protein Data Bank using MMTF and Apache Spark.
Pythonsparkmlbookclub
⭐
19
Interview Questions Collection
⭐
19
Interview questions organized by knowledge area, including C++, Java, Hadoop, machine learning, and more.
Data Science Ebooks
⭐
19
Data Science E-books, Interview Resources and Cheat-sheets
Azure Camp Dec
⭐
18
Code and hands-on repository for the December 2016 Azure Camp.
Data Pipeline Project
⭐
18
Data pipeline project
Fastml4j
⭐
16
Fast machine learning framework based on Scala and ND4J.
Interview Notes
⭐
15
Summary notes on Python, big data, and MySQL.
Cheatsheets For Ai
⭐
14
Cheatsheets on numerous topics, including data science, ML, DL, AI, and big data.
Recommendbyitemcf
⭐
11
Hadoop MapReduce. Collaborative filtering item recommendation system based on ItemCF.
Big_data_course_rimini_2021
⭐
11
This repository contains all the teaching material used during the "Laboratorio Big Data" course, run in collaboration with the municipality of Rimini.
Azure Camp Sep
⭐
10
Code and hands-on repository for the September Azure Camp.
Big Data Framework Demos
⭐
9
A collection of practice projects on big data topics, including Hadoop, Spark, Spark Streaming, and Kafka.
Data Science Related
⭐
8
Welcome to this soon-to-be comprehensive repository dedicated to Data Science, Machine Learning, and Big Data Management.
Tensorflow Python3 Jupyter
⭐
8
tensorflow-python3-jupyter
Coursera_bigdata_ucsd
⭐
8
UCSD Big Data Specialization General Materials and my Capstone Project.
Tech Faqs
⭐
8
Easy introductions to a few important, simple tech topics.
Azure Camp Mar
⭐
7
Code and hands-on repository for the March Azure Camp.
Azure Camp Jun
⭐
7
Code and hands-on repository for the June Azure Camp.
Anomaly Detection Log Datasets
⭐
7
Analysis scripts for log data sets used in anomaly detection.
Quickref
⭐
7
Quick references to notes on specific topics and their basic introductions
Largescaleml
⭐
7
Large Scale ML library for recommender systems
Data Engineer Portfolio
⭐
6
This is a repository to demonstrate my details, skills, projects and to keep track of my progression in Data Analytics and Data Science topics.
Time Series Analysis Nyc Taxi
⭐
6
⏰ 📓 Time series analysis of New York taxi data
Ethz Web Scale Data Mining Project
⭐
6
ETH Zurich - Web Scale Data Processing and Mining Project
0_to_ml_engineer
⭐
5
I'm teaching myself how to do machine learning via the internet and storing materials here.
Related Searches
Python Machine Learning (14,099)
Jupyter Notebook Machine Learning (12,247)
Machine Learning Neural Network (4,397)
Machine Learning Tensorflow (4,050)
Machine Learning Natural Language Processing (3,891)
Machine Learning Artificial Intelligence (3,877)
Machine Learning Data Science (3,802)
Machine Learning Pytorch (2,910)
Machine Learning Dataset (2,298)
Machine Learning Classification (2,243)
Copyright 2018-2024 Awesome Open Source. All rights reserved.