Awesome Open Source
Search results for data mining
926 search results found
Ml From Scratch
⭐
23,095
Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.
Awesome Datascience
⭐
23,007
📝 An awesome Data Science repository to learn and apply for real world problems.
Easyocr
⭐
20,438
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
Lightgbm
⭐
16,056
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
Awesome Production Machine Learning
⭐
15,804
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
Gensim
⭐
15,180
Topic Modelling for Humans
Python Machine Learning Book
⭐
11,645
The "Python Machine Learning (1st edition)" book code repository and info resource
Openrefine
⭐
10,106
OpenRefine is a free, open source power tool for working with messy data and improving it
Ai Learn
⭐
8,256
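An AI learning roadmap that collects nearly 200 hands-on case studies and projects, with free accompanying teaching materials; suitable for beginners and for job-oriented practice. Covers popular areas including: Py tensorflow machine-learning, deep-learning data-analysis data-mining mathematics data-science artificial-intelligence python tensorflow tensorflow2 caffe keras pytorch algorithm numpy pandas matplotlib seaborn nlp cv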
Pyod
⭐
7,751
A Comprehensive and Scalable Python Library for Outlier Detection (Anomaly Detection)
Anomaly Detection Resources
⭐
7,616
Anomaly detection related books, papers, videos, and toolboxes
Catboost
⭐
7,564
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.
Sktime
⭐
7,405
A unified framework for machine learning with time series
Awesome Ml For Cybersecurity
⭐
6,564
Machine Learning for Cyber Security
Ferret
⭐
5,540
Declarative web scraping
Mlxtend
⭐
4,669
A library of extension and helper modules for Python's data analysis and machine learning libraries.
Orange3
⭐
4,469
🍊 📊 💡 Orange: Interactive data analysis
Datascience
⭐
3,955
Curated list of Python resources for data science.
Rath
⭐
3,717
Next generation of automated data exploratory analysis and visualization platform.
Textract
⭐
3,699
extract text from any document. no muss. no fuss.
Kaggle Solutions
⭐
3,579
🏅 Collection of Kaggle Solutions and Ideas 🏅
Alink
⭐
3,479
Alink is the Machine Learning algorithm platform based on Flink, developed by the PAI team of Alibaba computing platform.
Machinelearning
⭐
3,016
Machine learning resources
Webplotdigitizer
⭐
2,375
Online tool to extract numerical data from plot images.
Awesome Ts Anomaly Detection
⭐
2,320
List of tools & datasets for anomaly detection on time-series data.
Bolt
⭐
2,312
10x faster matrix and vector operations
Graphic Walker
⭐
2,077
An open source alternative to Tableau. Easily embedded in any web apps.
Pdftabextract
⭐
1,994
A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.
Papers Literature Ml Dl Rl Ai
⭐
1,798
Highly cited and useful papers related to machine learning, deep learning, AI, game theory, reinforcement learning
Discord Datamining
⭐
1,631
Datamining Discord changes from the JS files
Ai For Security Learning
⭐
1,571
Security scenarios, AI-based security algorithms, and industry practice in security data analysis
Invoice2data
⭐
1,570
Extract structured data from PDF invoices
Research
⭐
1,550
novel deep learning research works with PaddlePaddle
Pycm
⭐
1,413
Multi-class confusion matrix library in Python
Awesome Fraud Detection Papers
⭐
1,364
A curated list of data mining papers about fraud detection.
Ail Framework
⭐
1,273
AIL framework - Analysis Information Leak framework. Project moved to https://github.com/ail-project
Tsv Utils
⭐
1,236
eBay's TSV Utilities: Command line tools for large, tabular data files. Filtering, statistics, sampling, joins and more.
Dex
⭐
1,193
Dex: The Data Explorer. A data visualization tool written in Java/Groovy/JavaFX capable of powerful ETL and publishing web visualizations.
Vvedenie Mashinnoe Obuchenie
⭐
1,187
📝 A collection of machine learning resources
Clevercsv
⭐
1,168
CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.
Graph Fraud Detection Papers
⭐
1,148
A curated list of graph-based fraud, anomaly, and outlier detection papers & resources
Awesome Fl
⭐
1,103
Comprehensive and timely academic information on federated learning (papers, frameworks, datasets, tutorials, workshops)
Awesome Ai Books
⭐
1,086
Some awesome AI-related books and PDFs for learning and downloading, plus some playground models for learning
Nfstream
⭐
1,043
NFStream: a Flexible Network Data Analysis Framework.
Team Learning Data Mining
⭐
1,001
Mainly stores materials for the "data mining / machine learning" track of Datawhale's team learning program.
Astroml
⭐
984
Machine learning, statistics, and data mining for astronomy and astrophysics
Deep_gcns_torch
⭐
940
PyTorch repo for DeepGCNs (ICCV'2019 Oral, TPAMI'2021), DeeperGCN (arXiv'2020) and GNN1000 (ICML'2021): https://www.deepgcns.org
Pyclustering
⭐
853
pyclustering is a Python, C++ data mining library.
Dataflowjavasdk
⭐
853
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
Pyhealth
⭐
825
A Deep Learning Python Toolkit for Healthcare Applications.
Feature Engineering And Feature Selection
⭐
798
A Guide for Feature Engineering and Feature Selection, with implementations and examples in Python.
My Tensorflow Tutorials
⭐
794
This repo contains all of my TensorFlow tutorials
Stocktalk
⭐
773
Data collection tool for social media analytics
Cookbook 2nd
⭐
773
IPython Cookbook, Second Edition, by Cyrille Rossant, Packt Publishing 2018
Graph Adversarial Learning Literature
⭐
772
A curated list of adversarial attacks and defenses papers on graph-structured data.
Elki
⭐
746
ELKI Data Mining Toolkit
R
⭐
745
Collection of various algorithms implemented in R.
Aeon
⭐
723
A toolkit for conducting machine learning tasks with time series data
Dataproofer
⭐
681
A proofreader for your data
Unitypy
⭐
665
UnityPy is a Python module that makes it possible to extract/unpack and edit Unity assets
Data Science With Ruby
⭐
664
Practical Data Science with Ruby based tools.
Interpretable_machine_learning_with_python
⭐
629
Examples of techniques for training interpretable ML models, explaining ML models, and debugging ML models for accuracy, discrimination, and security.
Awesome Ai For Time Series Papers
⭐
627
A professional list of Papers, Tutorials, and Surveys on AI for Time Series in top AI conferences and journals.
Awesome Deep Graph Clustering
⭐
626
Awesome Deep Graph Clustering is a collection of SOTA, novel deep graph clustering methods (papers, codes, and datasets).
Pm4py Core
⭐
617
Public repository for the PM4Py (Process Mining for Python) project.
Adbench
⭐
609
Official implementation of "ADBench: Anomaly Detection Benchmark".
Combo
⭐
607
(AAAI' 20) A Python Toolbox for Machine Learning Model Combination
Kam1n0 Community
⭐
601
The Kam1n0 Assembly Analysis Platform
Timetk
⭐
594
Time series analysis in the `tidyverse`
Game Datasets
⭐
584
🎮 A curated list of awesome game datasets, and tools for artificial intelligence in games
Python Twitter Examples
⭐
570
Examples of using Python for Twitter social data mining, using the python-twitter-tools framework.
Krangl
⭐
559
krangl is a {K}otlin DSL for data w{rangl}ing
Pypots
⭐
558
A Python toolbox/library for reality-centric machine learning/deep learning on partially-observed time series with PyTorch, including SOTA models supporting tasks of imputation, classification, clustering, and forecasting on incomplete (irregularly-sampled) multivariate time series with NaN missing values/data. https://arxiv.org/abs/2305.18811
Instascrape
⭐
554
Powerful and flexible Instagram scraping library for Python, providing easy-to-use and expressive tools for accessing data programmatically
Cookbook 2nd Code
⭐
532
Code of the IPython Cookbook, Second Edition, by Cyrille Rossant, Packt Publishing 2018 [read-only repository]
Text_mining_resources
⭐
511
Resources for learning about Text Mining and Natural Language Processing
Jekyll
⭐
498
Jekyll-based static site for The Programming Historian
Osintbuddy
⭐
498
Node graphs, OSINT data mining, and plugins. Connect unstructured and public data for transformative insights
Amazing Feature Engineering
⭐
485
Feature engineering is the process of using domain knowledge to extract features from raw data via data mining techniques. These features can be used to improve the performance of machine learning algorithms. Feature engineering can be considered as applied machine learning itself.
Hearthbreaker
⭐
474
A Hearthstone: Heroes of WarCraft Simulator for the purposes of Machine Learning and Data Mining
Cogcomp Nlp
⭐
448
CogComp's Natural Language Processing Libraries and Demos: Modules include lemmatizer, ner, pos, prep-srl, quantifier, question type, relation-extraction, similarity, temporal normalizer, tokenizer, transliteration, verb-sense, and more.
Ail Framework
⭐
440
AIL framework - Analysis Information Leak framework
Rong360
⭐
438
User loan risk prediction
Grimoirelab
⭐
432
GrimoireLab: platform for software development analytics and insights
Dgfraud
⭐
432
A Deep Graph-based Toolbox for Fraud Detection
Chefboost
⭐
428
A Lightweight Decision Tree Framework supporting regular algorithms: ID3, C4.5, CART, CHAID and Regression Trees; some advanced techniques: Gradient Boosting, Random Forest and Adaboost w/categorical features support for Python
Matminer
⭐
422
Data mining for materials science
Book Socialmediaminingpython
⭐
415
Companion code for the book "Mastering Social Media Mining with Python"
Pulsarrpa
⭐
413
Automate webpages at scale, scrape web data completely and accurately with high performance, distributed RPA.
Rmdl
⭐
409
RMDL: Random Multimodel Deep Learning for Classification
Ffxiv Datamining
⭐
407
This repository is to serve as a place to share data mining information related to Final Fantasy XIV.
Mli Resources
⭐
405
H2O.ai Machine Learning Interpretability Resources
Awesome Deep Community Detection
⭐
397
Deep and conventional community detection related papers, implementations, datasets, and tools.
Knowage Server
⭐
387
Knowage is the professional open source suite for modern business analytics over traditional sources and big data systems.
Instamancer
⭐
380
Scrape Instagram's API with Puppeteer
Suod
⭐
371
(MLSys' 21) An Acceleration System for Large-scale Unsupervised Heterogeneous Outlier Detection (Anomaly Detection)
Hnpickup
⭐
358
This is an educational example of a data mining web application: when is a good time to post on HN
Reaper
⭐
355
Social media scraping / data collection tool for the Facebook, Twitter, Reddit, YouTube, Pinterest, and Tumblr APIs
Fraud Detection Handbook
⭐
352
Reproducible Machine Learning for Credit Card Fraud Detection - Practical Handbook
Pydatalab
⭐
347
Open source code for the WeChat official account (ID: PyDataLab)
1-100 of 926 search results
Copyright 2018-2024 Awesome Open Source. All rights reserved.