Awesome Open Source
Search results for java parquet
71 search results found
Iceberg
⭐
5,179
Apache Iceberg
Parquet Mr
⭐
2,296
Apache Parquet
Drill
⭐
1,856
Apache Drill is a distributed MPP query layer for self describing data
Gaffer
⭐
1,724
A large-scale entity and relation database supporting aggregation of properties
Parquet Format
⭐
1,559
Apache Parquet
Adam
⭐
966
ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.
Tech.ml.dataset
⭐
616
A Clojure high performance data processing system
Iceberg
⭐
409
Iceberg is a table format for large, slow-moving tabular data
Centurion
⭐
318
Kotlin Bigdata Toolkit
Parquet Cpp
⭐
312
Apache Parquet
Bigdata File Viewer
⭐
269
A cross-platform (Windows, macOS, Linux) desktop application for viewing common big data binary formats such as Parquet, ORC, and Avro. Supports the local file system, HDFS, AWS S3, Azure Blob Storage, etc.
Rumble
⭐
194
⛈️ RumbleDB 1.21.0 "Hawthorn blossom" 🌳 for Apache Spark | Run queries on your large-scale, messy JSON-like data (JSON, text, CSV, Parquet, ROOT, AVRO, SVM...) | No install required (just a jar to download) | Declarative Machine Learning and more
Fhir Data Pipes
⭐
107
A collection of tools for extracting FHIR resources and analytics services on top of that data.
Streamx
⭐
95
kafka-connect-s3 : Ingest data from Kafka to Object Stores(s3)
Openstreetmap_h3
⭐
72
High-performance data loader for OSM planet dumps. Transforms an OpenStreetMap World/Region PBF dump into a lossless PostGIS pgsnapshot OSM schema representation partitioned by H3 regions and/or into ArrowIPC/Parquet dumps
Dendrite
⭐
67
Dendrite is a library for querying large datasets on a single host at near-interactive speeds.
Rainbow
⭐
61
A data layout optimization framework for wide tables stored on HDFS. See rainbow's webpage
Parquet Compatibility
⭐
59
Compatibility tests to make sure the C and Java implementations can read each other's files
Iceberg
⭐
59
A temporary home for LinkedIn's changes to Apache Iceberg (incubating)
Osm Parquetizer
⭐
58
A converter for the OSM PBFs to Parquet files
Spark Compaction
⭐
52
File compaction tool that runs on top of the Spark framework.
Entrada
⭐
44
Entrada - A tool for DNS big data analytics
Intellij Avro Parquet Plugin
⭐
40
A Tool Window plugin for IntelliJ that displays Avro and Parquet files and their schemas in JSON.
Parquet Testing
⭐
37
Auxiliary files for compatibility and integration tests for Apache Parquet
Paraflow
⭐
36
A real-time analytical system for ID-associated data
Avro2parquet
⭐
33
Hadoop MapReduce tool to convert Avro data files to Parquet format.
Arvo2parquet
⭐
30
Example program that writes Parquet formatted data to plain files (i.e., not Hadoop hdfs); Parquet is a columnar storage format.
Avro Json
⭐
29
Utilities for converting to and from JSON from Avro records via Hadoop streaming or Hive.
Meepo
⭐
27
Data migration across heterogeneous storage systems
Iow Hadoop Streaming
⭐
26
Set of hadoop input/output formats for use in combination with hadoop streaming
Parquet Flinktacular
⭐
26
How to use Parquet in Flink
Kafka Parquet Writer
⭐
26
This project provides a component that reads logs from Kafka and writes them as Parquet files on HDFS.
Parquet Tools
⭐
25
Command line tools for the parquet project
Hbase Tohdfs
⭐
23
Reads an HBase table and writes it out as Text, Seq, Avro, or Parquet
Kafka Connect Oss
⭐
21
Kafka Connect suite of connectors for OSS
Embulk Output Parquet
⭐
20
Streaming Data Platform
⭐
19
Albis
⭐
17
Albis: High-Performance File Format for Big Data Systems
Hadoop Etl Udfs
⭐
17
The Hadoop ETL UDFs are the main way to load data from Hadoop into EXASOL
Parquet Cli
⭐
15
Parquet Command-line Tools
Experience Platform Etl Reference
⭐
13
Examples for ETL Integrations with Adobe Experience Platform
Parquet Avro Protobuf
⭐
12
Example: Convert Protobuf to Parquet using parquet-avro and avro-protobuf
Hadoop Snippets
⭐
12
Parquet Floor
⭐
12
A lightweight Java library that facilitates reading and writing Apache Parquet files without Hadoop dependencies
Parquetplugin
⭐
11
Intelqatcodec
⭐
11
Beampipelinesamples
⭐
10
Provides different code samples for Apache Beam and DataFlow
Flink Tools
⭐
10
A collection of Flink applications for working with Pravega streams
Flinkparquet
⭐
10
Using the Parquet file format (with Avro) to process data with Apache Flink
Flink10_learn
⭐
9
Flink 10 self-study notes and code
Aws Ingesting Click Logs Using Terraform
⭐
9
Provision AWS infrastructure using Terraform (By HashiCorp): an example of web application logging customer data
Avro Cli
⭐
9
Yet Another Avro CLI Tool
Parquet Resultset
⭐
8
The parquet-resultset library can be used to convert standard sql result sets into parquet.
Example Applications
⭐
8
Example applications for use with PNDA
Jcascalog Parquet Example
⭐
7
Random Datagen
⭐
7
A generator of Random Data to HDFS, HBase, Hive, Kafka, Kudu, Ozone, SolR in CDP (Cloudera Data Platform)
Score
⭐
7
ScORe - Programmatic Schema On Read for Spark SQL, powered by Taboola
Parquet Cascalog
⭐
7
Cascading sink tap for parquet files, with some minimal clojure bindings
Sempala
⭐
7
Sempala is a SPARQL-over-SQL approach to provide interactive-time SPARQL query processing on Hadoop. It stores RDF data in a columnar layout (Parquet) on HDFS and uses either Impala or Spark as the execution layer on top of it. SPARQL queries are translated into Impala/Spark SQL for execution.
Avrotoolbox
⭐
7
ArcGIS toolbox to process feature classes in Apache Avro and Parquet format
Parquet Io Java
⭐
6
Java library to read Parquet files.
Drillbook
⭐
6
The Official Source Repository for Learning Apache Drill (O'Reilly, 2018)
Avrotoparquet
⭐
6
Command line converter for Apache Avro to Apache Parquet file formats
Csv2parquet2orc
⭐
6
CSV to Parquet and CSV to ORC converter with an aligned interface
Insight_project
⭐
6
Reddit Network Analytics
Parquet Mr Example
⭐
5
Benchmarking Arrow
⭐
5
Benchmarking Arrow/Java
Avroparquet
⭐
5
AVRO / Parquet Demo Code
Tessellate
⭐
5
A data engineering cli for reading and writing data to/from multiple locations across multiple formats.
Streamsx.parquet
⭐
5
(Incubation) Toolkit providing adapters to Parquet
Parquet Rewriter
⭐
5
A library to mutate parquet files
Copyright 2018-2024 Awesome Open Source. All rights reserved.