Awesome Open Source
Search results for csv parquet
42 search results found
Dsq
⭐
3,401
Command-line tool for running SQL queries against JSON, CSV, Excel, Parquet, and more.
Qsv
⭐
2,079
CSVs sliced, diced & analyzed.
Rill
⭐
1,145
Rill is a tool for effortlessly transforming data sets into powerful, opinionated dashboards using SQL. BI-as-code.
Choetl
⭐
693
ETL framework for .NET (Parser / Writer for CSV, Flat, Xml, JSON, Key-Value, Parquet, Yaml, Avro formatted files)
Tech.ml.dataset
⭐
616
A Clojure high performance data processing system
Vscode Data Preview
⭐
447
Data Preview 🈸 extension for importing 📤 viewing 🔎 slicing 🔪 dicing 🎲 charting 📊 & exporting 📥 large JSON array/config, YAML, Apache Arrow, Avro, Parquet & Excel data files
Elasticsearch_loader
⭐
349
A tool for batch loading data files (json, parquet, csv, tsv) into ElasticSearch
Rumble
⭐
194
⛈️ RumbleDB 1.21.0 "Hawthorn blossom" 🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
D6tstack
⭐
166
Quickly ingest messy CSV and XLS files. Export to clean pandas, SQL, parquet
Bdt
⭐
125
Boring Data Tool
Schemer
⭐
89
Schema registry for CSV, TSV, JSON, AVRO and Parquet schema. Supports schema inference and GraphQL API.
Prql Query
⭐
77
Query and transform data with PRQL
Ml Io
⭐
71
A high performance data access library for machine learning tasks
Faker Cli
⭐
61
Command-line interface to quickly generate fake CSV and JSON data
Csv2parquet
⭐
60
Create Parquet files from CSV
Quackpipe
⭐
56
DuckDB for ClickHouse users. QuackPipe is an OLAP API built on top of DuckDB with a few extra ClickHouse compatibility bits.
Parquetize
⭐
51
R package for converting databases of different formats to Parquet format
Records Mover
⭐
37
Python library and CLI you can use to move relational data from one place to another - DBs/CSV/gsheets/dataframes/...
Dply Rs
⭐
36
A dataframe manipulation tool inspired by dplyr.
Csv2parquet
⭐
33
Convert a CSV to a parquet file.
Dbd
⭐
29
dbd is a database prototyping tool that enables data analysts and engineers to quickly load and transform data in SQL databases.
Aws Redshift Spectrum Poc
⭐
27
CloudFormation and SQL scripts used to replicate a POC environment from the "Data Lake to Data Warehouse: Enhancing Customer 360 with Amazon Redshift Spectrum" post
Imctermite
⭐
25
Enables extraction of measurement data from binary files with extension 'raw' used by proprietary software imcFAMOS/imcSTUDIO and facilitates its storage in open source file formats
Daflow
⭐
24
Apache Spark-based data flow (ETL) framework that supports multiple read and write destinations of different types, as well as multiple categories of transformation rules.
Datareader
⭐
18
Read binary SAS (SAS7BDAT) and Stata (dta) files in the Go (Golang) programming language. Also provides command line tools for working with these file formats.
Spark Sql Gdelt
⭐
16
Scripts and code to import the GDELT dataset into Spark SQL for analysis
Dracula Covid19
⭐
16
An ETL tool for converting untyped CSV to parquet. Also triggers data lake updates.
Fileconvert
⭐
13
Converts between file formats such as CSV and Parquet
Data Generator
⭐
13
This repo is for generating data from an existing dataset to a file, or for producing dataset rows as messages to Kafka in a streaming manner.
Tableio.jl
⭐
12
A glue package for reading and writing tabular data. It aims to provide a uniform api for reading and writing tabular data from and to multiple sources.
Pynock
⭐
11
A proposed standard `NOCK` for a Parquet format that supports efficient distributed serialization of multiple kinds of graph technologies
Chicagocrimes
⭐
11
Exploring public Chicago crimes data set in Python
Scalpel Flattening
⭐
11
This repository hosts code related to SNDS database flattening
Greatex
⭐
10
A project for exploring how Great Expectations can be used to ensure data quality and validate batches within a data pipeline defined in Airflow.
Db2ixf
⭐
10
db2ixf is a Python package with a CLI that simplifies the parsing and processing of IBM Integration eXchange Format (IXF) files.
Flinkparquet
⭐
10
Using the Parquet file format (with Avro) to process data with Apache Flink
Pyspark Dataframe Made Easy
⭐
10
PySpark DataFrame made easy
Typed Dfs
⭐
8
Make Pandas DataFrames enforce definitions, self-organize, and correctly serialize in 18 formats.
Query
⭐
7
Big data query console command and script for Scala
Aporia Importer
⭐
7
🏋️‍♀️ Import inference data from Amazon S3, Azure Blob Storage, Google Cloud Storage and others to Aporia
Csv2parquet2orc
⭐
6
CSV to Parquet and CSV to ORC converter with an aligned interface
Ob_datastash
⭐
6
Stream your CSV files to an HTTP API
Related Searches
Python Csv (5,199)
Javascript Csv (1,924)
Json Csv (1,137)
Php Csv (1,016)
Java Csv (960)
Csv Excel (651)
Database Csv (621)
Jupyter Notebook Csv (602)
Golang Csv (589)
C Sharp Csv (479)
Copyright 2018-2024 Awesome Open Source. All rights reserved.