Awesome Open Source
Search results for data catalog
60 search results found
Datahub
⭐
8,889
The Metadata Platform for the Modern Data Stack
Amundsen
⭐
4,262
Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.
Openmetadata
⭐
3,512
Open Standard for Metadata. A Single place to Discover, Collaborate and Get your data right.
Odd Platform
⭐
1,047
First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
Intake
⭐
976
Intake is a lightweight package for finding, investigating, loading and disseminating data.
Whale
⭐
710
🐳 The stupidly simple CLI workspace for your data warehouse.
Awesome Data Catalogs
⭐
441
📙 Awesome Data Catalogs and Observability Platforms.
Recap
⭐
292
Work with your web service, database, and streaming schemas in a single format.
Piicatcher
⭐
215
Scan databases and data warehouses for PII data. Tag tables and columns in data catalogs like Amundsen and Datahub
Meteor
⭐
168
Meteor is an easy-to-use, plugin-driven metadata collection framework to extract data from different sources and sink to any data catalog.
Gravitino
⭐
153
World's most powerful data catalog service, providing a high-performance, geo-distributed and federated metadata lake.
Bigquery Data Lineage
⭐
132
Reference implementation for real-time Data Lineage tracking for BigQuery using Audit Logs, ZetaSQL and Dataflow.
Intake Esm
⭐
128
An intake plugin for parsing an Earth System Model (ESM) catalog and loading assets into xarray datasets.
Metamapper
⭐
60
Metamapper is a data discovery and documentation platform for improving how teams understand and interact with their data.
Datacatalog Connectors
⭐
47
Common code used by the Data Catalog connectors, and links to the connectors' sample code.
Aws Dbs Refarch Datalake
⭐
47
Reference Architectures for Datalakes on AWS
Datacatalog Tag Engine
⭐
43
Tag Engine lets you automate the process of creating and populating metadata tags with Google Cloud's Data Catalog. Tag Engine is licensed under the Apache 2 license terms. Please make sure to read, understand and agree to the terms of the LICENSE and CONTRIBUTING files before proceeding.
Colid Documentation
⭐
41
The documentation repository is part of the Corporate Linked Data Catalog - short: COLID - application.
Datacatalog Connectors Rdbms
⭐
40
Sample code with integration between Data Catalog and RDBMS data sources.
Odd Collector
⭐
39
Open-source metadata collector based on ODD Specification
Data Detective
⭐
35
Data catalog for everything in your company
Nada
⭐
35
National Data Archive (NADA) is an open source data cataloging system that serves as a portal for researchers to browse, search, compare, apply for access, and download relevant census or survey information. It was originally developed to support the establishment of national survey data archives.
Grizzly
⭐
34
End-to-end DataOps platform deployed by Terraform.
Stairlight
⭐
25
A data lineage tool that detects table dependencies from rendered SQL statements.
Datacatalog Connectors Bi
⭐
24
Sample code with integration between Data Catalog and BI data sources.
Pace
⭐
24
Data policy IN, dynamic view OUT: PACE is the Policy As Code Engine. It helps you programmatically create and apply a data policy to a processing platform like Databricks, Snowflake or BigQuery, with definitions imported from Collibra, Datahub, ODD and the like.
Dataportals Registry
⭐
22
Registry of data portals, catalogs, and data repositories, including a dataset of data catalogs and a catalog description standard
Egeria Docs
⭐
22
Documentation repository for the Egeria project.
Portal.js.bak
⭐
20
🌀 The JS data presentation framework. For a single dataset to a full catalog.
Datacatalog Connectors Hive
⭐
19
Sample code with integration between Data Catalog and Hive data source.
Frontend
⭐
18
SciCat open data catalogue web client
Remote_climate_data
⭐
17
A collection of remote climate data accessed via Intake and cached to disk
Datacatalog Tag Manager
⭐
15
Python package to manage Google Cloud Data Catalog tags, loading metadata from external sources -- currently supports the CSV file format
Data
⭐
15
data catalogs and utilities
Google Datacatalog Dbt Tag
⭐
15
Update a Google Data Catalog tag with dbt Cloud run metadata
Darkseal
⭐
14
A Single place to Discover, Collaborate, and Get your data right
Aeda
⭐
14
Build a data catalog by running a single line of code
Colid Setup
⭐
14
The setup repository is part of the Corporate Linked Data Catalog - short: COLID - application. It helps set up a local environment based on Docker Compose.
Midden
⭐
13
A research metadata catalog and metadata editor that integrates into common workflows used in academic research.
Sddi Ckan K8s
⭐
13
Helm chart for Smart District Data Infrastructure enabled CKAN
Articat
⭐
13
articat: data artifact catalog
Awesome Opendata Software
⭐
11
Awesome list of the software tools related to opendata: data catalogs, ingestion tools, data prep tools and so on
Carte
⭐
11
A Python library to generate static data catalog sites. Carte scrapes metadata from your data assets and generates a fully searchable front end that's just HTML.
Esm Collection Spec
⭐
11
Earth System Model Collection specification
Herd Mdl
⭐
11
Herd-MDL, a turnkey managed data lake in the cloud. See https://finraos.github.io/herd-mdl/ for more information.
Data Nasa Gov Frontpage
⭐
10
a frontpage for data.nasa.gov
Datacat
⭐
10
A system for managing files and file replicas across many diverse sites
Pudl Catalog
⭐
9
An Intake catalog for distributing open energy system data liberated by Catalyst Cooperative.
Gift
⭐
9
Gold Idea First Templates covering data, analytics and visualization.
Colid Data Marketplace Frontend
⭐
9
The Data Marketplace frontend repository is part of the Corporate Linked Data Catalog - short: COLID - application. Users can search for registered resources in COLID. It provides a search bar, aggregation filters and a search results display with term highlighting.
Datacatalog Tag History
⭐
8
Historical metadata of your data warehouse is a treasure trove to discover not just insights about changing data patterns, but also quality and user behaviour. This solution creates Data Catalog Tags history in BigQuery since Data Catalog keeps only the latest version of metadata for fast searchability.
Datacatalog Util
⭐
8
A Python package to centralize some Google Cloud Data Catalog scripts. This repo contains commands, like bulk CSV operations, that help leverage Data Catalog features.
Gcp Datacatalog Python
⭐
7
Python samples to help Data Citizens who work with Google Cloud Data Catalog
Ckanext Asc Csa
⭐
7
📈 CKAN extension for the CSA open data and information portal
Polar Eo Database
⭐
6
Polar Earth Observation Database of satellite sensors
Colid Editor Frontend
⭐
6
The editor frontend repository is part of the Corporate Linked Data Catalog - short: COLID - application. It offers users a metadata-based user interface to register resources in COLID.
Colid Search Service
⭐
5
The search service repository is part of the Corporate Linked Data Catalog - short: COLID - application. It makes the data findable and provides indexing and search functionalities based on Elasticsearch.
Intake Nested Yaml Catalog
⭐
5
Supports a single YAML file hierarchical catalog to organize datasets and avoid a data swamp.
Colid Indexing Crawler Service
⭐
5
The Indexing Crawler Service (ICS) repository is part of the Corporate Linked Data Catalog - short: COLID - application. It is responsible for extracting data from an RDF storage system, transforming and enriching the data, and finally sending it via a message queue to the DMP Webservice for indexing.
Analytics_data_where_house
⭐
5
An analytics engineering sandbox focusing on real estate prices in Cook County, IL
Copyright 2018-2024 Awesome Open Source. All rights reserved.