Awesome Open Source
Search results for java data lake
7 search results found
Trino
⭐
9,118
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Starrocks
⭐
7,191
StarRocks, a Linux Foundation project, is a next-generation sub-second MPP OLAP database for full analytics scenarios, including multi-dimensional analytics, real-time analytics, and ad-hoc queries. Winner of InfoWorld’s 2023 BOSSIE Award for best open source software.
Dinky
⭐
2,657
Dinky is a data development platform based on Apache Flink, enabling agile data development and deployment.
Lakesoul
⭐
2,248
LakeSoul is an end-to-end, real-time, cloud-native Lakehouse framework with fast data ingestion, concurrent updates, and incremental data analytics on cloud storage for both BI and AI applications.
Bitsail
⭐
1,514
BitSail is a distributed high-performance data integration engine which supports batch, streaming and incremental scenarios. BitSail is widely used to synchronize hundreds of trillions of data every day.
Kylo
⭐
1,035
Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licensed under Apache 2.0. Contributed by Teradata Inc.
Zingg
⭐
828
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
Amoro
⭐
617
Amoro is a Lakehouse management system built on open data lake formats.
Marmaray
⭐
444
Generic Data Ingestion & Dispersal Library for Hadoop
Hivemq Mqtt Tensorflow Kafka Realtime Iot Machine Learning Training Inference
⭐
159
Real Time Big Data / IoT Machine Learning (Model Training and Inference) with HiveMQ (MQTT), TensorFlow IO and Apache Kafka - no additional data store like S3, HDFS or Spark required
Gravitino
⭐
153
A data catalog service, described by the project as the world's most powerful, that provides a high-performance, geo-distributed, and federated metadata lake.
Streamis
⭐
96
Streaming application development and management system based on Linkis and DSS, with plans to provide workflow-like graphical drag-and-drop development.
Accio
⭐
43
Accio - Query Your Data Warehouse Like Exploring One Big View.
Hiveberg
⭐
16
Demonstration of a Hive Input Format for Iceberg
Herd Mdl
⭐
11
Herd-MDL, a turnkey managed data lake in the cloud. See https://finraos.github.io/herd-mdl/ for more information.
Copyright 2018-2024 Awesome Open Source. All rights reserved.