Awesome Open Source
Search results for machine learning big data
125 search results found
Awesome Scalability
⭐
50,409
The Patterns of Scalable, Reliable, and Performant Large-Scale Systems
Data Science Ipython Notebooks
⭐
25,668
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Gun
⭐
17,626
An open source cybersecurity protocol for syncing decentralized graph data.
Vaex
⭐
8,161
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
Catboost
⭐
7,564
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
H2o 3
⭐
6,618
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Feast
⭐
5,342
The Open Source Feature Store for Machine Learning
Vespa
⭐
5,115
AI + Data, online. https://vespa.ai
Synapseml
⭐
4,989
Simple and Distributed Machine Learning
Volcano
⭐
3,577
A Cloud Native Batch System (Project under CNCF)
Data Science Roadmap
⭐
2,445
Data Science Roadmap from A to Z
Root
⭐
2,329
The official repository for ROOT: analyzing, storing and visualizing big data, scientifically
Spark
⭐
1,963
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Byzer Lang
⭐
1,821
Byzer (formerly MLSQL): A low-code open-source programming language for data pipeline, analytics and AI.
Spark Py Notebooks
⭐
1,515
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Optimus
⭐
1,447
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
Scikit Learn Intelex
⭐
1,116
Intel(R) Extension for Scikit-learn is a seamless way to speed up your Scikit-learn application
Datumbox Framework
⭐
1,089
Datumbox is an open-source Machine Learning framework written in Java which allows the rapid development of Machine Learning and Statistical applications.
Kube Batch
⭐
1,065
A batch scheduler of kubernetes for high performance workload, e.g. AI/ML, BigData, HPC
Kube Batch
⭐
1,055
A batch scheduler of kubernetes for high performance workload, e.g. AI/ML, BigData, HPC
Daft
⭐
1,012
Distributed DataFrame for Python designed for the cloud, powered by Rust
Autodl
⭐
999
Automated Deep Learning without ANY human intervention. 1st solution for the AutoDL challenge@NeurIPS.
Sparkling Water
⭐
957
Sparkling Water provides H2O functionality inside Spark cluster
Sciblog_support
⭐
742
Support content for my blog
Pgm Index
⭐
693
🏅State-of-the-art learned data structure that enables fast lookup, predecessor, range searches and updates in arrays of billions of items using orders of magnitude less space than traditional indexes
Data Science Career
⭐
661
Career Resources for Data Science, Machine Learning, Big Data and Business Analytics Career Repository
Sdc
⭐
645
Numba extension for compiling Pandas data frames, Intel® Scalable Dataframe Compiler
Courses
⭐
590
Answers for Quizzes & Assignments that I have taken
Eland
⭐
588
Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch
Onedal
⭐
584
oneAPI Data Analytics Library (oneDAL)
Bigartm
⭐
537
Fast topic modeling platform
Bigslice
⭐
525
A serverless cluster computing system for the Go programming language
Awesome Data Catalogs
⭐
441
📙 Awesome Data Catalogs and Observability Platforms.
Api.rss
⭐
345
RSS as RESTful. This service allows you to transform RSS feed into an awesome API.
100daysofmlcode
⭐
302
My journey to learn and grow in the domain of Machine Learning and Artificial Intelligence by performing the #100DaysofMLCode Challenge. Now supported by bright developers adding their learnings 👍
Flink Ml
⭐
270
Machine learning library of Apache Flink
Geni
⭐
268
A Clojure dataframe library that runs on Spark
Devops Roadmap
⭐
266
DevOps methodology & roadmap for a devops developer in 2019. Interesting books to learn new technologies.
Shifu
⭐
235
An end-to-end machine learning and data mining framework on Hadoop
Verticapy
⭐
210
VerticaPy is a Python library that exposes scikit-like functionality to conduct data science projects on data stored in Vertica, thus taking advantage of Vertica’s speed and built-in analytics and machine learning capabilities.
Sgdlibrary
⭐
181
MATLAB/Octave library for stochastic optimization algorithms: Version 1.0.20
Tipdm
⭐
178
TipDM modeling platform, an open-source data mining tool.
Data Science Live Book
⭐
177
An open source book to learn data science, data analysis and machine learning, suitable for all ages!
Setl
⭐
177
A simple Spark-powered ETL framework that just works 🍺
Idp
⭐
165
IDP is an open source AI IDE for data scientists and big data engineers.
Datasciencevm
⭐
161
Tools and Docs on the Azure Data Science Virtual Machine (http://aka.ms/dsvm)
Bigdata Playground
⭐
154
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
Data Algorithms With Spark
⭐
151
O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
Tennis Crystal Ball
⭐
150
Ultimate Tennis Statistics and Tennis Crystal Ball - Tennis Big Data Analysis and Prediction
Bigdata
⭐
142
Hadoop, HBase, Storm, Spark, etc.
Notebook
⭐
140
✍ Notes on the computer science knowledge I have learned along the way, aiming to build an AI & CS & SE knowledge system
Sparkling Graph
⭐
134
SparklingGraph provides an easy-to-use set of features that let you process large-scale graphs using Spark and GraphX.
Incubator Liminal
⭐
131
Apache Liminal's goal is to operationalise the machine learning process, allowing data scientists to quickly transition from a successful experiment to an automated pipeline of model training, validation, deployment and inference in production. Liminal provides a Domain Specific Language to build ML workflows on top of Apache Airflow.
Acousticbrainz Server
⭐
126
The server components for the AcousticBrainz project
Merlin
⭐
112
Machine Learning for HPC Workflows
Macro_ml
⭐
104
Course Website on Macroeconomic Analysis with Machine Learning and Big Data
Covid19 Sir
⭐
103
CovsirPhy: Python library for COVID-19 analysis with phase-dependent SIR-derived ODE models.
Vizuka
⭐
100
Explore high-dimensional datasets and how your algo handles specific regions.
Spark With Python
⭐
98
Fundamentals of Spark with Python (using PySpark), code examples
Clgen
⭐
97
Deep learning program generator
Classifai
⭐
96
🔥 One of the most comprehensive open-source data annotation platforms.
Sift
⭐
91
Knowledge extraction from web data
Bitcoin Value Predictor
⭐
90
[NOT MAINTAINED] Predicting Bitcoin price using time series analysis and sentiment analysis of tweets on Bitcoin
Open Source Handbook
⭐
81
⭐️ Open source projects for all skill levels
Anovos
⭐
78
Anovos - An Open Source Library for Scalable feature engineering Using Apache-Spark
Books2rec
⭐
76
A recommender system built for book lovers.
Hyper Engine
⭐
69
Python library for Bayesian hyper-parameters optimization
Ineuron Full Stack Data Science Assignments
⭐
68
This Repository consists of Assignments and projects of the iNeuron Full Stack Data Science Course
Mmtf Pyspark
⭐
64
Methods for the parallel and distributed analysis and mining of the Protein Data Bank using MMTF and Apache Spark.
Meetups Archivos
⭐
62
Slides, code, and videos from the meetups, data science days, video calls, and workshops. Data Science Research is a non-profit organization that seeks to spread and decentralize knowledge of Data Science and Artificial Intelligence in Peru, giving opportunities to new talent through MeetUps, Workshops, and Knowledge and Research Seedbeds.
Awesome Ai Kubernetes
⭐
62
❄️ 🐳 Awesome tools and libs for AI, Deep Learning, Machine Learning, Computer Vision, Data Science, Data Analytics and Cognitive Computing that are baked in the oven to be Native on Kubernetes and Docker with Python, R, Scala, Java, C#, Go, Julia, C++ etc
Xcast
⭐
62
A High-Performance Data Science Toolkit for the Earth Sciences
Datadocs
⭐
61
Documentation for data enthusiasts
Quantq
⭐
61
The repository for the Machine Learning and Big Data with kdb+/q book by Novotny et al.
Mmtf Workshop 2018
⭐
53
Structural Bioinformatics Training Workshop & Hackathon 2018
Cloud Bigdata Book
⭐
53
write book
Shelf
⭐
52
a Wide Shelf for AI and Data Science | Resources 🍔
Rsopt
⭐
51
Riemannian stochastic optimization algorithms: Version 1.0.3
Ia Z
⭐
47
Repository for the AI course by the @DefendIntelligence community.
Lime For Time
⭐
47
Application of the LIME algorithm by Marco Tulio Ribeiro, Sameer Singh, Carlos Guestrin to the domain of time series classification
R4ml
⭐
45
Scalable R for Machine Learning
Dislib
⭐
41
The Distributed Computing library for python implemented using PyCOMPSs programming model for HPC.
Subsemble
⭐
39
subsemble R package for ensemble learning on subsets of data
Aialgorithmsandapplications
⭐
39
Course Material: AI Algorithms and Applications with Python I + II
Books
⭐
39
A collection of books covering C & C++, git, Java, Keras, Linux, NLP, Python, Scala, TensorFlow, big data, recommender systems, databases, data mining, machine learning, deep learning, algorithms, and more.
Gendergaptracker
⭐
36
Scrape news articles and analyze them using NLP to quantify the gender gap in Canadian mainstream media
Sageworks
⭐
36
SageWorks: An easy to use Python API for creating and deploying SageMaker Models
Yaetos
⭐
32
Write data & AI pipelines in (SQL, Spark, Pandas) and deploy to the cloud, simplified
Ides
⭐
32
Intelligent Data Exploration Service (IDES), a one-stop Data + AI data solution!
Spark Mllib Tutorial
⭐
31
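A comprehensive explanation of the basic algorithms in Spark MLlib, the machine learning library of the big data framework Spark, with complete test files included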
Artificial Intelligence Important Documents Collections
⭐
31
AI technology is significant because it allows software to do human functions—understanding, reasoning, planning, communication, and perception—increasingly effectively, efficiently, and affordably.
Vehicleorientationdataset
⭐
30
The vehicle orientation dataset is a large-scale dataset containing more than one million annotations for vehicle detection with simultaneous orientation classification using a standard object detection network.
Baskerville
⭐
30
Security Analytics Engine - Anomaly Detection in Web Traffic
Complete Data Science Roadmap
⭐
29
Complete Roadmap For Data Science
Appliedmathschoollectures
⭐
28
Lectures on "crime and political corruption analysis using data mining, machine learning and complex networks" at the School of Applied Mathematics in the Institute of Mathematics and Computer Science at University of São Paulo
Gdlibrary
⭐
27
Matlab library for gradient descent algorithms: Version 1.0.1
Thepersonalmsds
⭐
23
The Personal MS(DS) is an initiative to customize the Data Science Masters roadmap according to one's interests hence providing complete autonomy to the learner. The intuition behind #thepersonalmsds is to upgrade skills without formally enrolling into a Master's program at a University
Algorithmstar
⭐
23
The AS machine learning library supports the calculation of various metric coefficients and provides computation components for basic machine learning algorithms such as kNN, decision trees, and linear regression. It also supports SQL programming for data calculation and a powerful machine vision library, which can easily meet various data processing requirements.
Detecting Malicious Url Machine Learning
⭐
23
Insightedge
⭐
22
InsightEdge Core
Related Searches
Python Machine Learning (14,099)
Jupyter Notebook Machine Learning (12,247)
Machine Learning Neural Network (4,397)
Machine Learning Tensorflow (4,050)
Machine Learning Natural Language Processing (3,891)
Machine Learning Artificial Intelligence (3,877)
Machine Learning Data Science (3,802)
Machine Learning Pytorch (2,910)
Machine Learning Dataset (2,298)
Machine Learning Computer Vision (1,966)
1-100 of 125 search results
Copyright 2018-2024 Awesome Open Source. All rights reserved.