Awesome Open Source
Search results for data science dataframe
82 search results found
Modin
⭐
9,275
Modin: Scale your Pandas workflows by changing a single line of code
Vaex
⭐
8,161
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
Cudf
⭐
6,936
cuDF - GPU DataFrame Library
Smile
⭐
5,833
Statistical Machine Intelligence & Learning Engine
Datasciencepython
⭐
4,776
Common data analysis and machine learning tasks using Python
Danfojs
⭐
4,416
Danfo.js is an open-source JavaScript library providing high-performance, intuitive, and easy-to-use data structures for manipulating and processing structured data.
Mimesis
⭐
4,298
Mimesis is a powerful Python library that empowers developers to generate massive amounts of synthetic data efficiently.
Tablesaw
⭐
3,328
Java dataframe and visualization library
Koalas
⭐
3,291
Koalas: pandas API on Apache Spark
Sweetviz
⭐
2,687
Visualize and compare datasets, target values and associations, with one line of code.
Dataframe
⭐
2,129
C++ DataFrame for statistical, financial, and ML analysis, written in modern C++ using native types and contiguous memory storage
Sketch
⭐
2,106
AI code-writing assistant that understands data content
Tv
⭐
1,956
📺(tv) Tidy Viewer is a cross-platform CLI CSV pretty printer that uses column styling to maximize viewer enjoyment.
Pandas Videos
⭐
1,808
Jupyter notebook and datasets from the pandas Q&A video series
Tiledb
⭐
1,700
The Universal Storage Engine
Hamilton
⭐
1,272
Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage and metadata. Runs and scales everywhere Python does.
Arcticdb
⭐
1,071
ArcticDB is a high performance, serverless DataFrame database built for the Python Data Science ecosystem.
Daft
⭐
1,012
Distributed DataFrame for Python designed for the cloud, powered by Rust
Explorer
⭐
915
Series (one-dimensional) and dataframes (two-dimensional) for fast and elegant data exploration in Elixir
Hamilton
⭐
877
A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton
Pdpipe
⭐
710
Easy pipelines for pandas DataFrames.
Dataframe
⭐
642
Structured data processing in Kotlin
Dataframe Go
⭐
642
DataFrames for Go: For statistics, machine-learning, and data manipulation/exploration
Tech.ml.dataset
⭐
616
A Clojure high performance data processing system
Datasheets
⭐
613
Read data from, write data to, and modify the formatting of Google Sheets
Data Science Your Way
⭐
532
Ways of doing Data Science Engineering and Machine Learning in R and Python
Traceml
⭐
490
Engine for ML/Data tracking, visualization, explainability, drift detection, and dashboards for Polyaxon.
Dataframe Js
⭐
383
A JavaScript library providing a new data structure for data scientists and developers
Qframe
⭐
372
Immutable data frame for Go
Gspread Pandas
⭐
371
A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.
Pandasvault
⭐
353
Advanced Pandas Vault — Utilities, Functions and Snippets (by @firmai).
Datacompy
⭐
339
Pandas and Spark DataFrame comparison for humans and more!
Data Science Hacks
⭐
300
Data Science Hacks consists of tips and tricks to help you become a better data scientist. The hacks are for everyone, from beginner to advanced, and cover Python, Jupyter Notebook, pandas, and more.
Geni
⭐
268
A Clojure dataframe library that runs on Spark
Morpheus Core
⭐
239
The foundational library of the Morpheus data science framework
Rightmove_webscraper.py
⭐
219
Python class to scrape data from rightmove.co.uk and return listings in a pandas DataFrame object
Snowpark Python
⭐
215
Snowflake Snowpark Python API
Pydbgen
⭐
199
Random dataframe and database table generator
Rumble
⭐
194
⛈️ RumbleDB 1.21.0 "Hawthorn blossom" 🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Woodwork
⭐
133
Woodwork is a Python library that provides robust methods for managing and communicating data typing information.
Dh Core
⭐
118
Functional data science
Xda
⭐
108
R package for exploratory data analysis
Pulsar Spark
⭐
103
Spark Connector to read and write with Pulsar
Tablexplore
⭐
96
Table analysis and plotting application written in PySide2/PyQt5
Hmni
⭐
92
📛 Fuzzy Name Matching with Machine Learning
Tidypandas
⭐
89
A grammar of data manipulation for pandas inspired by tidyverse
Deepr
⭐
80
Deep R Programming (Open-Access Textbook)
Typedframe
⭐
78
Typed wrappers over pandas DataFrames with schema validation
Vtree
⭐
73
An R package for calculating and drawing variable trees
Dataframe
⭐
69
DataFrame in Pharo - tabular data structures for data analysis
Lens
⭐
67
Summarise and explore Pandas DataFrames
Dataframe
⭐
51
DataFrame Library for Java
Go Dataframe
⭐
50
A simple package to abstract away the process of creating usable DataFrames for data analytics. This package is heavily inspired by the amazing Python library, Pandas.
Apple Health Exporter
⭐
44
Python module to export Apple Health dump file to a data frame for analysis
Stonks.jl
⭐
44
Julia library for standardizing financial data retrieval and storage from multiple APIs.
Purple_air_api
⭐
42
Python package to get and transform PurpleAir data
Wikirepo
⭐
36
Python-based Wikidata framework for easy dataframe extraction
Pyspark Algorithms
⭐
33
PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2
Oxbow
⭐
32
Read specialized NGS formats as data frames in R, Python, and more.
10 Simple Hacks To Speed Up Your Data Analysis In Python
⭐
29
Some useful tips and tricks to speed up the data analysis process in Python.
Pandas Sqlalchemy Tutorial
⭐
29
🐼 💻 Load or insert data into a SQL database using Pandas DataFrames.
Parides
⭐
29
Prometheus metrics to pandas DataFrame / CSV exporter. Mainly useful for analyzing your metrics with data science tools.
R Data Wrangling
⭐
27
D-Lab's 6-hour introduction to data wrangling with R. Learn how to manipulate dataframes using the tidyverse in R.
Elucidate
⭐
26
Convenience functions to help researchers elucidate patterns in their data
Frames Beam
⭐
24
Accessing Postgres in a data frame in Haskell
Pandas Estat
⭐
24
Retrieves data from e-Stat, the portal site of official Japanese government statistics, in Pandas DataFrame format.
Cfanalytics
⭐
23
Downloading, analyzing and visualizing CrossFit data
Bow
⭐
23
Go data analysis / manipulation library built on top of Apache Arrow
Dominando Pandas
⭐
22
This repository is intended for learning the Pandas framework
Boltzmannclean
⭐
21
Fill missing values in Pandas DataFrames using Restricted Boltzmann Machines
Schrutepy
⭐
20
The entire transcript from The Office in tidy format
Julia Data Science
⭐
20
Data science and numerical computing with Julia
Frame
⭐
20
A DataFrame for JavaScript
Foxcross
⭐
19
AsyncIO serving for data science models
Polypoly
⭐
18
Helper functions for orthogonal polynomials in R
Saddle
⭐
18
SADDLE: Scala Data Library
Crysda
⭐
17
Crystal library for Data Analysis, Wrangling, Munging
Heidi
⭐
17
heidi: tidy data in Haskell
Wrangle
⭐
16
A data transformation package for deep learning with Autonomio, Keras and TensorFlow.
Ml_preprocessing
⭐
16
Implementation of popular data preprocessing algorithms for Machine learning
Disarray
⭐
15
Confusion matrix metrics directly from your pandas DataFrame
Viper
⭐
14
Simple, expressive pipeline syntax to transform and manipulate data with ease
Jandas
⭐
14
A very much Pandas-like JavaScript library for data science
Markovclick
⭐
13
Python package to model clickstream data as a Markov chain. Inspired by R package clickstream.
Ickle
⭐
13
🔍 Experimental DataFrame, statistics and analysis library for Python
Tdf
⭐
12
🚴🏅📊Tour de France winners and stages data
Ml_dataframe
⭐
12
A way to store and manipulate data
Online_preprocessing_for_ml
⭐
12
Web App for preprocessing - Machine Learning - Streamlit
Data Tools
⭐
12
Common Python tools and utilities for data engineering, ETL, exploration, etc., made open source and packaged so they are easy to use in any environment.
Tableio.jl
⭐
12
A glue package for reading and writing tabular data. It aims to provide a uniform API for reading and writing tabular data from and to multiple sources.
Annie
⭐
12
An NLP chatbot trained using a corpus of Reddit data.
Go Df
⭐
11
Dataframes for Golang
Rflow
⭐
10
Flexible R Pipelines with Caching
Ipydataclean
⭐
10
Interactive cleaning for Pandas DataFrames
Rcoboldi
⭐
9
R COBOL DI (Data Integration) Package: Import COBOL CopyBook data files directly into R as properly structured data frames.
Columnar
⭐
9
An idiomatic Kotlin dataframe toolkit for data engineering tasks on datasets of any size
Icp4d Customer Churn Classifier
⭐
8
Infuse AI into your application. Create and deploy a customer churn prediction model with IBM Cloud Private for Data, Db2 Warehouse, Spark MLlib, and Jupyter notebooks.
Pdf2dataset
⭐
8
Converts a whole subdirectory with a big (or small) volume of PDF documents to a dataset (pandas DataFrame) with error tracking and choice of features
Datapipe
⭐
7
Pipeline API for manipulating dataframes
Datasciforcybersecurity
⭐
6
Open source code and resources arising from the ATI-funded Data Science for Cybersecurity project
Related Searches
Python Data Science (6,905)
Machine Learning Data Science (5,390)
Jupyter Notebook Data Science (3,734)
Python Dataframe (1,170)
R Data Science (1,164)
Deep Learning Data Science (1,039)
Html Data Science (872)
Data Science Pandas (794)
Pandas Dataframe (737)
Artificial Intelligence Data Science (662)
Copyright 2018-2024 Awesome Open Source. All rights reserved.