Awesome Open Source
Combined Topics: datascience
Advertising
📦 10
All Projects
Application Programming Interfaces
📦 124
Applications
📦 192
Artificial Intelligence
📦 78
Blockchain
📦 73
Build Tools
📦 113
Cloud Computing
📦 80
Code Quality
📦 28
Collaboration
📦 32
Command Line Interface
📦 49
Community
📦 83
Companies
📦 60
Compilers
📦 63
Computer Science
📦 80
Configuration Management
📦 42
Content Management
📦 175
Control Flow
📦 213
Data Formats
📦 78
Data Processing
📦 276
Data Storage
📦 135
Economics
📦 64
Frameworks
📦 215
Games
📦 129
Graphics
📦 110
Hardware
📦 152
Integrated Development Environments
📦 49
Learning Resources
📦 166
Legal
📦 29
Libraries
📦 129
Lists Of Projects
📦 22
Machine Learning
📦 347
Mapping
📦 64
Marketing
📦 15
Mathematics
📦 55
Media
📦 239
Messaging
📦 98
Networking
📦 315
Operating Systems
📦 89
Operations
📦 121
Package Managers
📦 55
Programming Languages
📦 245
Runtime Environments
📦 100
Science
📦 42
Security
📦 396
Social Media
📦 27
Software Architecture
📦 72
Software Development
📦 72
Software Performance
📦 58
Software Quality
📦 133
Text Editors
📦 49
Text Processing
📦 136
User Interface
📦 330
User Interface Components
📦 514
Version Control
📦 30
Virtualization
📦 71
Web Browsers
📦 42
Web Servers
📦 26
Web User Interface
📦 210
The Top 63 Datascience Open Source Projects
Virgilio
⭐
12,939
Your new Mentor for Data Science E-Learning.
Ds Cheatsheets
⭐
8,350
List of Data Science Cheatsheets to rule the world
Industry Machine Learning
⭐
5,793
A curated list of applied machine learning and data science notebooks and libraries across different industries (by @firmai)
Modin
⭐
5,651
Modin: Speed up your Pandas workflows by changing a single line of code
Datascience
⭐
2,508
Curated list of Python resources for data science.
Pyfunctional
⭐
1,772
Python library for creating data pipelines with chain functional programming
Datasciencer
⭐
1,633
A curated list of R tutorials for Data Science, NLP and Machine Learning
Pbpython
⭐
1,508
Code, Notebooks and Examples from Practical Business Python
An Introduction To Statistical Learning
⭐
1,384
This repository contains the exercises from the book "An Introduction to Statistical Learning" and their solutions, in Python.
Ggstatsplot
⭐
1,050
Enhancing `ggplot2` plots with statistical analysis 📊🎨📣
Skater
⭐
968
Python Library for Model Interpretation/Explanations
Clevercsv
⭐
876
CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.
Datastream.io
⭐
811
An open-source framework for real-time anomaly detection using Python, ElasticSearch and Kibana
Numerical Computing Is Fun
⭐
729
Learning numerical computing with notebooks for all ages.
Vegas
⭐
706
The missing MatPlotLib for Scala + Spark
Ai Series
⭐
699
📚 [.md & .ipynb] Series of Artificial Intelligence & Deep Learning, including Mathematics Fundamentals, Python Practices, NLP Application, etc. 💫 AI and deep learning in practice: mathematical statistics | machine learning | deep learning | natural language processing | tool practice with Scikit, TensorFlow & PyTorch | industry applications & course notes
Business Machine Learning
⭐
567
A curated list of practical business machine learning (BML) and business data science (BDS) applications for Accounting, Customer, Employee, Legal, Management and Operations (by @firmai)
Jupyter Notify
⭐
487
A Jupyter Notebook magic for browser notifications of cell completion
Socios Brasil
⭐
435
Captures data on the partners of Brazilian companies from the Receita Federal and exports it to a human-readable format
Or Pandas
⭐
425
[运筹OR帷幄 | Data Science] pandas tutorial series e-book
Krangl
⭐
414
krangl is a {K}otlin DSL for data w{rangl}ing
Dataframe Js
⭐
366
A JavaScript library providing a new data structure for data scientists and developers
For Data Science Beginners
⭐
311
Set of 📝 with 🔗 to help those who are Data Science beginners 🤖
Notebooks Statistics And Machinelearning
⭐
269
Jupyter Notebooks from the old UnsupervisedLearning.com (RIP) machine learning and statistics blog
Oie Resources
⭐
263
A curated list of Open Information Extraction (OIE) resources: papers, code, data, etc.
Introduction Datascience Python Book
⭐
259
Introduction to Data Science: A Python Approach to Concepts, Techniques and Applications
Dgfraud
⭐
251
A Deep Graph-based Toolbox for Fraud Detection
Code
⭐
251
Compilation of R and Python code from the Data Professor YouTube channel.
Salarios Magistrados
⭐
248
Downloads the magistrates' salary spreadsheets, extracts the pay stubs, cleans them, and exports them to CSV
Datacamp Python Data Science Track
⭐
225
All the slides, accompanying code, and exercises are stored in this repo. 🎈
My Awesome Ai Bookmarks
⭐
220
Curated list of my reads, implementations and core concepts of Artificial Intelligence, Deep Learning, Machine Learning by the best folks in the world.
Melusine
⭐
215
Melusine is a high-level library for email classification and feature extraction, "dedicated to French emails".
Morpheus Core
⭐
199
The foundational library of the Morpheus data science framework
Tech.ml.dataset
⭐
192
A Clojure high performance data processing system
Ocaml Jupyter
⭐
172
An OCaml kernel for Jupyter (IPython) notebook
100 Days Of Ml Code
⭐
169
A day-to-day plan for this challenge. Covers both theoretical and practical aspects
Emotion Classification From Audio Files
⭐
169
Understanding emotions from audio files using neural networks and multiple datasets.
Data Science Resources
⭐
167
👨🏽🏫 You can learn about what data science is and why it's important in today's modern world. Are you interested in data science? 🔋
Climate Change Data
⭐
166
🌍 A curated list of APIs, open data and ML/AI projects on climate change
Around Dataengineering
⭐
163
A Data Engineering & Machine Learning Knowledge Hub
Boostaroota
⭐
159
A fast xgboost feature selection algorithm
Wikipedia Mirror
⭐
158
🌐 Guide and tools to run a full offline mirror of Wikipedia.org with three different approaches: Nginx caching proxy, Kiwix + ZIM dump, and MediaWiki/XOWA + XML dump
Oreilly Intro To Predictive Clv
⭐
153
Repo that contains the supporting material for O'Reilly Webinar "An Intro to Predictive Modeling for Customer Lifetime Value" on Feb 28, 2017
Anaconda Project
⭐
146
Tool for encapsulating, running, and reproducing data science projects
Vscode Jupyter
⭐
144
VS Code Jupyter extension
Renku
⭐
138
The Renku Project provides a platform and tools for reproducible and collaborative data analysis.
Blockchain2graph
⭐
134
Blockchain2graph extracts blockchain data (bitcoin) and inserts it into a graph database (neo4j).
Kravis
⭐
133
A {K}otlin g{ra}mmar for data {vis}ualization
Awesome Shiny Apps For Statistics
⭐
124
🌟 A curated list of Awesome Shiny Apps for Statistics (ASAS)🌟
Data_science_blogs
⭐
123
A repository to keep track of all the code that I end up writing for my blog posts.
Openuba
⭐
112
A robust and flexible open source User & Entity Behavior Analytics (UEBA) framework used for Security Analytics. Developed with luv by Data Scientists & Security Analysts from the Cyber Security Industry. [PRE-ALPHA]
The Data Science Workshop
⭐
110
A New, Interactive Approach to Learning Data Science
Deep Ml Meetups
⭐
108
A central repository for all my projects
Machine_learning_a Z
⭐
100
Learning to create Machine Learning Algorithms
Covid19 Dashboard
⭐
87
🦠 Django + Plotly Coronavirus dashboard. Powerful data driven Python web-app, with an awesome UI. Contributions welcomed! Found on 🕶Awesome-list
Repo2docker Action
⭐
83
GitHub Action for repo2docker
Data Umbrella Scikit Learn Sprint
⭐
71
Jun 2020 scikit-learn sprint
Knyfe
⭐
54
knyfe is a python utility for rapid exploration of datasets.
R Community Explorer
⭐
40
Data-Driven Exploration of the R Community
Commons
⭐
34
⛲️ Commons Marketplace client & server to explore, download, and publish open data sets in the Ocean Protocol Network.
Kubeflow Data Science On Steroids
⭐
25
The blog post about Kubeflow, including all materials
Ditras
⭐
17
DITRAS (DIary-based TRAjectory Simulator), a mathematical model to simulate human mobility
Talks
⭐
16
Repository of publicly available talks by Leon Eyrich Jessen, PhD. Talks cover Data Science and R in the context of research
1-63 of 63 projects