Awesome Open Source
Search results for apache big data
apache
big-data
124 search results found
Spark
⭐
37,661
Apache Spark - A unified analytics engine for large-scale data processing
Flink
⭐
22,747
Apache Flink
Cookbook
⭐
12,557
The Data Engineering Cookbook
God Of Bigdata
⭐
8,483
Focused on big data learning and interview preparation; the road to big data mastery starts here. Flink/Spark/Hadoop/HBase/Hive.
Beam
⭐
7,355
Apache Beam is a unified programming model for Batch and Streaming data processing.
Hive
⭐
5,222
Apache Hive
Ignite
⭐
4,626
Apache Ignite
Calcite
⭐
4,216
Apache Calcite
Koalas
⭐
3,291
Koalas: pandas API on Apache Spark
Flume
⭐
2,475
Mirror of Apache Flume
Parquet Mr
⭐
2,296
Apache Parquet
Ambari
⭐
2,030
Apache Ambari simplifies provisioning, managing, and monitoring of Apache Hadoop clusters.
Spark
⭐
1,963
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Drill
⭐
1,856
Apache Drill is a distributed MPP query layer for self-describing data
Bookkeeper
⭐
1,828
Apache BookKeeper - a scalable, fault tolerant and low latency storage service optimized for append-only workloads
Carbondata
⭐
1,401
High performance data store solution
Spark Doc Zh
⭐
1,186
Chinese translation of the official Apache Spark documentation
Phoenix
⭐
1,006
Mirror of Apache Phoenix
Accumulo
⭐
1,003
Apache Accumulo
Adam
⭐
966
ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.
Coding Now
⭐
925
Study notes, plus some e-books I have read, video resources, and blogs I have collected and consider worthwhile,
Tispark
⭐
872
TiSpark is built for running Apache Spark on top of TiDB/TiKV
Dataflowjavasdk
⭐
853
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
Incubator Livy
⭐
840
Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.
Sqoop
⭐
820
Mirror of Apache Sqoop
Samza
⭐
792
Mirror of Apache Samza
Orc
⭐
645
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
Spark Rapids
⭐
619
Spark RAPIDS plugin - accelerate Apache Spark with GPUs
Giraph
⭐
582
Mirror of Apache Giraph
Parquetviewer
⭐
574
Simple Windows desktop application for viewing & querying Apache Parquet files
Nussknacker
⭐
564
Low-code tool for automating actions on real time data | Stream processing for the users.
Spline
⭐
553
Data Lineage Tracking And Visualization Solution
Bigtop
⭐
549
Bigtop is an Apache Foundation project for Infrastructure Engineers and Data Scientists looking for comprehensive packaging, testing, and configuration of the leading open source big data components.
Bigdata Ecosystem
⭐
536
BigData Ecosystem Dataset
Datawave
⭐
512
DataWave is an ingest/query framework that leverages Apache Accumulo to provide fast, secure data access.
Hudi Resources
⭐
509
A collection of Apache Hudi resources
Tez
⭐
446
Apache Tez
Helix
⭐
440
Mirror of Apache Helix
Sparkler
⭐
401
Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
Couchdb Fauxton
⭐
361
Fauxton is the new Web UI for CouchDB
Apex Core
⭐
346
Mirror of Apache Apex core
Hyperspace
⭐
334
An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.
Morpheus
⭐
330
Morpheus brings the leading graph query language, Cypher, onto the leading distributed processing platform, Spark.
Parquet Dotnet
⭐
319
🏐 Apache Parquet for modern .NET
Parquet Cpp
⭐
312
Apache Parquet
Every Single Day I Tldr
⭐
311
A daily digest of the articles or videos I've found interesting, that I want to share with you.
Trafodion
⭐
243
Apache Trafodion
Couchdb Docker
⭐
242
Semi-official Apache CouchDB Docker images
Succinct
⭐
239
Enabling queries on compressed data.
Node Hbase
⭐
232
Asynchronous HBase client for NodeJs using REST
Azure Event Hubs Spark
⭐
225
Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Calcite Avatica
⭐
225
Apache Calcite Avatica
Flink Notes
⭐
223
Flink study notes
Sparkrdma
⭐
191
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Fluo
⭐
183
Apache Fluo
Spark.jl
⭐
180
Julia binding for Apache Spark
Tipdm
⭐
178
TipDM modeling platform, an open source data mining tool.
Knox
⭐
174
Mirror of Apache Knox
Incubator Wayang
⭐
162
Apache Wayang(incubating) is the first cross-platform data processing system.
Bigdata Playground
⭐
154
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
Metamodel
⭐
144
Mirror of Apache Metamodel
Storm Doc Zh
⭐
143
Chinese translation of the official Apache Storm documentation
Parquetsharp
⭐
142
ParquetSharp is a .NET library for reading and writing Apache Parquet files.
Flink Web
⭐
133
Apache Flink Website
Incubator Liminal
⭐
131
Apache Liminal's goal is to operationalise the machine learning process, allowing data scientists to quickly transition from a successful experiment to an automated pipeline of model training, validation, deployment and inference in production. Liminal provides a Domain Specific Language to build ML workflows on top of Apache Airflow.
Apex Malhar
⭐
131
Mirror of Apache Apex malhar
Flink Shaded
⭐
130
Apache Flink shaded artifacts repository
Tajo
⭐
129
Mirror of Apache Tajo
Hama
⭐
127
Mirror of Apache Hama
Mnemonic
⭐
115
Apache Mnemonic - A non-volatile hybrid memory storage oriented library
Gora
⭐
111
The Apache Gora open source framework provides an in-memory data model and persistence for big data.
Calcite Avatica Go
⭐
110
Mirror of Apache Calcite - Avatica Go SQL Driver
Frank Kanes Taming Big Data With Apache Spark And Python
⭐
106
Frank Kane's Taming Big Data with Apache Spark and Python, published by Packt
Crunch
⭐
100
Mirror of Apache Crunch (Incubating)
Spark With Python
⭐
98
Fundamentals of Spark with Python (using PySpark), code examples
Falcon
⭐
95
Mirror of Apache Falcon
Airavata
⭐
92
A general purpose Distributed Systems Framework
Reef
⭐
92
Mirror of Apache REEF
Predictionio Template Recommender
⭐
78
PredictionIO Recommendation Engine Template (Scala-based parallelized engine)
The Apache Ignite Book
⭐
72
All code samples, scripts and more in-depth examples for The Apache Ignite Book. Includes Apache Ignite 2.6 or above.
Cleanframes
⭐
70
type-class based data cleansing library for Apache Spark SQL
Apache Spark Hands On
⭐
64
Educational notes and hands-on problems with solutions for the Hadoop ecosystem
Incubator Tez
⭐
60
Mirror of Apache Tez (Incubating)
Lens
⭐
57
Mirror of Apache Lens
Oodt
⭐
55
Mirror of Apache OODT
Data_processing_course
⭐
53
Some class materials for a data processing course using PySpark
Doris Website
⭐
51
Apache Doris Website
Phoenix Connectors
⭐
48
Apache Phoenix Connectors
R4ml
⭐
45
Scalable R for Machine Learning
Phoenix Queryserver
⭐
41
Apache Phoenix Query Server
Predictionio Template Attribute Based Classifier
⭐
38
PredictionIO Classification Engine Template (Scala-based parallelized engine)
Flink Book
⭐
38
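Big data, stream computing, real-time computing: learning materials for the Flink framework. Companion code for the best-selling book 《深入理解Flink核心设计与实践原理》 (Understanding Flink's Core Design and Practical Principles in Depth); every Flink feature explained in the book comes with complete, runnable code for readers to run and test. The whole project contains [182 Jav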
Accumulo Examples
⭐
34
Apache Accumulo Examples
Ambari Metrics
⭐
34
Apache Ambari Metrics is a sub project of Apache Ambari.
Predictionio Template Text Classifier
⭐
33
Text Classification Engine
Nifi
⭐
32
Deploy a secured, clustered, auto-scaling NiFi service in AWS.
Kibble
⭐
30
Apache Kibble - a tool to collect, aggregate and visualize data about any software project
Beam Site
⭐
27
Apache Beam Site
Apache Hive Essentials Second Edition
⭐
27
Apache Hive Essentials, Second Edition published by Packt
Airavata Django Portal
⭐
27
Apache Airavata Django Portal Framework
Related Searches
Java Apache (4,331)
Php Apache (2,627)
Shell Apache (1,492)
Javascript Apache (1,450)
Python Apache (1,438)
Docker Apache (1,277)
Apache Spark (1,207)
Mysql Apache (961)
Apache Kafka (836)
Scala Apache (705)
1-100 of 124 search results
Copyright 2018-2024 Awesome Open Source. All rights reserved.