Awesome Open Source
Search results for apache big data
206 search results found
Spark
⭐
36,844
Apache Spark - A unified analytics engine for large-scale data processing
Flink
⭐
22,018
Apache Flink
Cookbook
⭐
11,769
The Data Engineering Cookbook
God Of Bigdata
⭐
8,483
Focused on big data learning and interview preparation; start your road to big data mastery. Flink/Spark/Hadoop/HBase/Hive.
Beam
⭐
7,159
Apache Beam is a unified programming model for Batch and Streaming data processing.
Zeppelin
⭐
6,161
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Hive
⭐
5,095
Apache Hive
Ignite
⭐
4,548
Apache Ignite
Calcite
⭐
4,039
Apache Calcite
Koalas
⭐
3,291
Koalas: pandas API on Apache Spark
Flume
⭐
2,448
Mirror of Apache Flume
Parquet Mr
⭐
2,167
Apache Parquet
Ambari
⭐
1,991
Apache Ambari simplifies provisioning, managing, and monitoring of Apache Hadoop clusters.
Spark
⭐
1,930
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
Drill
⭐
1,837
Apache Drill is a distributed MPP query layer for self-describing data
Bookkeeper
⭐
1,788
Apache BookKeeper - a scalable, fault tolerant and low latency storage service optimized for append-only workloads
Carbondata
⭐
1,389
High performance data store solution
Spark Doc Zh
⭐
1,186
Chinese translation of the official Apache Spark documentation
Phoenix
⭐
996
Mirror of Apache Phoenix
Accumulo
⭐
995
Apache Accumulo
Adam
⭐
955
ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.
Coding Now
⭐
925
Study notes, along with eBooks I have read, video resources, and blogs I have collected that I consider worthwhile, …
Tispark
⭐
862
TiSpark is built for running Apache Spark on top of TiDB/TiKV
Dataflowjavasdk
⭐
853
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
Sqoop
⭐
820
Mirror of Apache Sqoop
Incubator Livy
⭐
819
Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.
Samza
⭐
783
Mirror of Apache Samza
Ozone
⭐
689
Scalable, redundant, and distributed object store for Apache Hadoop
Orc
⭐
626
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
Giraph
⭐
582
Mirror of Apache Giraph
Spark Rapids
⭐
578
Spark RAPIDS plugin - accelerate Apache Spark with GPUs
Spline
⭐
539
Data Lineage Tracking And Visualization Solution
Bigdata Ecosystem
⭐
536
BigData Ecosystem Dataset
Parquetviewer
⭐
533
Simple Windows desktop application for viewing & querying Apache Parquet files
Bigtop
⭐
532
Bigtop is an Apache Foundation project for Infrastructure Engineers and Data Scientists looking for comprehensive packaging, testing, and configuration of the leading open source big data components.
Nussknacker
⭐
503
Low-code tool for automating actions on real time data | Stream processing for the users.
Datawave
⭐
493
DataWave is an ingest/query framework that leverages Apache Accumulo to provide fast, secure data access.
Hudi Resources
⭐
487
A collection of Apache Hudi resources
Helix
⭐
432
Mirror of Apache Helix
Tez
⭐
430
Apache Tez
Sparkler
⭐
401
Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.
Couchdb Fauxton
⭐
352
Fauxton is the new Web UI for CouchDB
Apex Core
⭐
346
Mirror of Apache Apex core
Hyperspace
⭐
334
An open source indexing subsystem that brings index-based query acceleration to Apache Spark™ and big data workloads.
Morpheus
⭐
327
Morpheus brings the leading graph query language, Cypher, onto the leading distributed processing platform, Spark.
Parquet Dotnet
⭐
319
🏐 Apache Parquet for modern .NET
Parquet Cpp
⭐
312
Apache Parquet
Every Single Day I Tldr
⭐
307
A daily digest of the articles or videos I've found interesting, that I want to share with you.
Trafodion
⭐
243
Apache Trafodion
Succinct
⭐
239
Enabling queries on compressed data.
Couchdb Docker
⭐
233
Semi-official Apache CouchDB Docker images
Node Hbase
⭐
232
Asynchronous HBase client for Node.js using REST
Azure Event Hubs Spark
⭐
225
Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Flink Notes
⭐
223
Flink study notes
Calcite Avatica
⭐
211
Apache Calcite Avatica
Sparkrdma
⭐
191
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Fluo
⭐
181
Apache Fluo
Spark.jl
⭐
180
Julia binding for Apache Spark
Tipdm
⭐
178
TipDM modeling platform, an open-source data mining tool.
Knox
⭐
169
Mirror of Apache Knox
Bigdata Playground
⭐
154
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
Metamodel
⭐
144
Mirror of Apache Metamodel
Storm Doc Zh
⭐
143
Chinese translation of the official Apache Storm documentation
Incubator Wayang
⭐
142
Apache Wayang (incubating) is the first cross-platform data processing system.
Incubator Liminal
⭐
131
Apache Liminal's goal is to operationalise the machine learning process, allowing data scientists to quickly transition from a successful experiment to an automated pipeline of model training, validation, deployment and inference in production. Liminal provides a Domain Specific Language to build ML workflows on top of Apache Airflow.
Apex Malhar
⭐
131
Mirror of Apache Apex malhar
Tajo
⭐
129
Mirror of Apache Tajo
Hama
⭐
127
Mirror of Apache Hama
Flink Web
⭐
125
Apache Flink Website
Flink Shaded
⭐
121
Apache Flink shaded artifacts repository
Mnemonic
⭐
114
Apache Mnemonic - A non-volatile hybrid memory storage oriented library
Calcite Avatica Go
⭐
110
Mirror of Apache Calcite - Avatica Go SQL Driver
Gora
⭐
109
The Apache Gora open source framework provides an in-memory data model and persistence for big data.
Frank Kanes Taming Big Data With Apache Spark And Python
⭐
106
Frank Kane's Taming Big Data with Apache Spark and Python, published by Packt
Crunch
⭐
100
Mirror of Apache Crunch (Incubating)
Spark With Python
⭐
98
Fundamentals of Spark with Python (using PySpark), code examples
Falcon
⭐
95
Mirror of Apache Falcon
Reef
⭐
92
Mirror of Apache REEF
Airavata
⭐
89
A general purpose Distributed Systems Framework
Predictionio Template Recommender
⭐
78
PredictionIO Recommendation Engine Template (Scala-based parallelized engine)
Cleanframes
⭐
70
type-class based data cleansing library for Apache Spark SQL
The Apache Ignite Book
⭐
66
All code samples, scripts and more in-depth examples for The Apache Ignite Book. Includes Apache Ignite 2.6 or above.
Apache Spark Hands On
⭐
64
Educational notes and hands-on problems with solutions for the Hadoop ecosystem
Incubator Tez
⭐
60
Mirror of Apache Tez (Incubating)
Lens
⭐
57
Mirror of Apache Lens
Oodt
⭐
55
Mirror of Apache OODT
Data_processing_course
⭐
53
Some class materials for a data processing course using PySpark
Phoenix Connectors
⭐
47
Apache Phoenix Connectors
R4ml
⭐
45
Scalable R for Machine Learning
Doris Website
⭐
45
Apache Doris Website
Phoenix Queryserver
⭐
39
Apache Phoenix Query Server
Flink Book
⭐
38
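Big data, stream computing, real-time computing: learning materials for the Flink framework. Companion code for the best-selling book 《深入理解Flink核心设计与实践原理》 (an in-depth look at Flink's core design and practical principles); every Flink feature covered in the book comes with complete, runnable code for readers to run and test. The whole project contains [182 Jav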
Predictionio Template Attribute Based Classifier
⭐
38
PredictionIO Classification Engine Template (Scala-based parallelized engine)
Ambari Metrics
⭐
34
Apache Ambari Metrics is a sub project of Apache Ambari.
Fluo Uno
⭐
34
Apache Fluo Uno
Predictionio Template Text Classifier
⭐
33
Text Classification Engine
Nifi
⭐
32
Deploy a secured, clustered, auto-scaling NiFi service in AWS.
Accumulo Examples
⭐
32
Apache Accumulo Examples
Kibble
⭐
30
Apache Kibble - a tool to collect, aggregate and visualize data about any software project
Apache Hive Essentials Second Edition
⭐
27
Apache Hive Essentials, Second Edition published by Packt
Related Searches
Java Apache (4,331)
Php Apache (2,291)
Javascript Apache (1,450)
Python Apache (1,438)
Shell Apache (1,374)
Docker Apache (1,277)
Apache Spark (1,207)
Mysql Apache (865)
Apache Kafka (836)
Scala Apache (705)
1-100 of 206 search results
Copyright 2018-2023 Awesome Open Source. All rights reserved.