Awesome Open Source
Search results for data profiling
30 search results found
Ydata Profiling
⭐
12,222
Data quality profiling and exploratory data analysis in one line of code, for Pandas and Spark DataFrames.
Great_expectations
⭐
9,179
Always know what to expect from your data.
Cleanlab
⭐
8,696
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Openmetadata
⭐
3,512
Open Standard for Metadata. A Single place to Discover, Collaborate and Get your data right.
Sweetviz
⭐
2,687
Visualize and compare datasets, target values and associations, with one line of code.
Soda Core
⭐
1,644
⚡ Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
Optimus
⭐
1,447
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
Odd Platform
⭐
1,047
First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
Cleanvision
⭐
739
Automatically find issues in image datasets and practice data-centric computer vision.
Traceml
⭐
493
Engine for ML/Data tracking, visualization, explainability, drift detection, and dashboards for Polyaxon.
Popmon
⭐
461
Monitor the stability of a Pandas or Spark dataframe ⚙︎
Haupt
⭐
451
Lineage metadata API, artifacts streams, sandbox, API, and spaces for Polyaxon
Piperider
⭐
443
Code review for data in dbt
Bumblebee
⭐
124
🚕 A spreadsheet-like data preparation web app that works over Optimus (Pandas, Dask, cuDF, Dask-cuDF, Spark and Vaex)
Swiple
⭐
72
Swiple enables you to easily observe, understand, validate and improve the quality of your data
Desbordante
⭐
54
Desbordante is a high-performance data profiler capable of discovering many different patterns in data using various algorithms. It can also run data-cleaning scenarios built on these algorithms. Desbordante has a console version and an easy-to-use web application.
Data Profiling
⭐
46
A set of scripts to pull metadata and data profiling metrics from relational database systems.
Odd Collector
⭐
39
Open-source metadata collector based on ODD Specification
Dqo
⭐
37
Data Quality and Observability platform with custom rules, data quality KPIs and data quality dashboards. Measure the data quality, not only observe it!
Auctus
⭐
34
Dataset search engine, discovering data from a variety of sources, profiling it, and allowing advanced queries on the index
Metacrafter
⭐
34
Metadata and data identification tool and Python library. Identifies PII, common identifiers, and language-specific identifiers. Fully customizable and flexible rules.
Fta
⭐
21
Metadata/data identification Java library. Identifies Semantic Type information (e.g. Gender, Age, Color, Country,...). Extensive country/language support. Extensible via user-defined plugins. Comprehensive Profiling support.
Raymon
⭐
17
The official http://raymon.ai data profiling and logging library.
Cleanlab Studio
⭐
16
Client interface for all things Cleanlab Studio
Roomba
⭐
12
A Node.js tool to examine the correctness of Open Data Metadata and build custom dataset profiles
Dataqtor
⭐
11
🔍Your Data Quality Detector / Gain insight into your data and get it ready for use before you start working with it 💡📊🛠💎
Gate
⭐
10
Drift detection module for machine learning pipelines.
Greatex
⭐
10
A project for exploring how Great Expectations can be used to ensure data quality and validate batches within a data pipeline defined in Airflow.
Data Cleaning
⭐
7
Data cleaning tool.
Kglids
⭐
6
Linked Data Science powered by Knowledge Graphs
Copyright 2018-2024 Awesome Open Source. All rights reserved.