Awesome Open Source
Search results for apache hadoop
342 search results found
Spark
⭐
36,808
Apache Spark - A unified analytics engine for large-scale data processing
Cookbook
⭐
11,769
The Data Engineering Cookbook
God Of Bigdata
⭐
8,483
Focused on big data study and interview preparation; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive.
Hive
⭐
5,089
Apache Hive
Ignite
⭐
4,544
Apache Ignite
Bigdl
⭐
4,392
Accelerating LLM with low-bit (INT3 / INT4 / NF4 / INT5 / INT8) optimizations using bigdl-llm
Calcite
⭐
4,038
Apache Calcite
Tensorflowonspark
⭐
3,851
TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.
Nutch
⭐
2,675
Apache Nutch is an extensible and scalable web crawler
Ambari
⭐
1,991
Apache Ambari simplifies provisioning, managing, and monitoring of Apache Hadoop clusters.
Elasticsearch Hadoop
⭐
1,915
🐘 Elasticsearch real-time search and analytics natively integrated with Hadoop
Drill
⭐
1,837
Apache Drill is a distributed MPP query layer for self-describing data
Atlas
⭐
1,643
Apache Atlas
Carbondata
⭐
1,376
High performance data store solution
Dr Elephant
⭐
1,301
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark
Hadoop Docker
⭐
1,169
Hadoop docker image
Kylo
⭐
1,035
Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licensed under Apache 2.0. Contributed by Teradata Inc.
Impala
⭐
1,010
Apache Impala
Awesome Hadoop
⭐
987
A curated list of amazingly awesome Hadoop and Hadoop ecosystem resources
Coding Now
⭐
925
Some study notes, along with eBooks I have read, video resources, and a collection of blogs I consider worthwhile,
Sqoop
⭐
820
Mirror of Apache Sqoop
Ozone
⭐
688
Scalable, redundant, and distributed object store for Apache Hadoop
Hawq
⭐
677
Apache HAWQ
Pig
⭐
659
Mirror of Apache Pig
Sparkr Pkg
⭐
649
R frontend for Spark
Orc
⭐
625
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
Giraph
⭐
582
Mirror of Apache Giraph
Spline
⭐
538
Data Lineage Tracking And Visualization Solution
Bigdata Ecosystem
⭐
536
BigData Ecosystem Dataset
Bigtop
⭐
532
Bigtop is an Apache Foundation project for Infrastructure Engineers and Data Scientists looking for comprehensive packaging, testing, and configuration of the leading open source big data components.
Tez
⭐
430
Apache Tez
Hadoopinternals
⭐
424
Diagrams describing Apache Hadoop internals (2.3.0 or later).
Eagle
⭐
410
Mirror of Apache Eagle
Graphx
⭐
353
Former GraphX development repository. GraphX has been merged into Apache Spark; please submit pull requests there.
Apex Core
⭐
346
Mirror of Apache Apex core
Easyhadoop
⭐
310
Apache Hadoop management system
Behemoth
⭐
284
Behemoth is an open source platform for large scale document analysis based on Apache Hadoop.
Hadoop Connectors
⭐
274
Libraries and tools for interoperability between Hadoop-related open-source software and Google Cloud Platform.
Sparkstreaming
⭐
253
Real-time logs implemented with Spark Streaming+Flume+Kafka+HBase+Hadoop+Zookeeper
Kerberos_and_hadoop
⭐
248
Kerberos and Hadoop: The Madness beyond the Gate
Hive Jdbc Uber Jar
⭐
248
Hive JDBC "uber" or "standalone" jar based on the latest Apache Hive version
Trafodion
⭐
243
Apache Trafodion
Node Hbase
⭐
232
Asynchronous HBase client for NodeJs using REST
Calcite Avatica
⭐
211
Apache Calcite Avatica
Emr Dynamodb Connector
⭐
204
Implementations of open source Apache Hadoop/Hive interfaces which allow for ingesting data from Amazon DynamoDB
Sparkrdma
⭐
191
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Docker Flink
⭐
157
Apache Flink docker image
Bigdata Playground
⭐
154
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache Hbase, Apache Parquet, Apache Avro, Apache Storm, Twitter Api, MongoDB, NodeJS, Angular, GraphQL
Logparser
⭐
148
Easy parsing of Apache HTTPD and NGINX access logs with Java, Hadoop, Hive, Flink, Beam, Storm, Drill, ...
Incubator Wayang
⭐
142
Apache Wayang (incubating) is the first cross-platform data processing system.
Hdfs_fdw
⭐
131
PostgreSQL foreign data wrapper for HDFS
Parquet Rs
⭐
129
Apache Parquet implementation in Rust
Tajo
⭐
129
Mirror of Apache Tajo
Skein
⭐
126
A tool and library for easily deploying applications on Apache YARN
Docker Spark
⭐
118
Docker image for a general Apache Spark client
Bdutil
⭐
114
[DEPRECATED] Script used to manage Hadoop and Spark instances on Google Compute Engine
Calcite Avatica Go
⭐
110
Mirror of Apache Calcite - Avatica Go SQL Driver
Gora
⭐
109
The Apache Gora open source framework provides an in-memory data model and persistence for big data.
Stocator
⭐
108
Stocator is a high-performing connector to object storage for Apache Spark, achieving performance by leveraging object storage semantics.
Datafu
⭐
106
Mirror of Apache DataFu
Linkedin Gradle Plugin For Apache Hadoop
⭐
106
Crux
⭐
101
Crux is a reporting application for HBase. Crux provides a simple web based graphical interface to access HBase, query data and create reports. Crux is open sourced under Apache Software Foundation License v2.0.
Spark With Python
⭐
98
Fundamentals of Spark with Python (using PySpark), code examples
Reef
⭐
92
Mirror of Apache REEF
Halyard
⭐
88
Halyard is an extremely horizontally scalable Triplestore with support for Named Graphs, designed for integration of extremely large Semantic Data Models, and for storage and SPARQL 1.1 querying of snapshots of the whole Linked Data universe.
Docker Cloudera Quickstart
⭐
87
Docker Cloudera Quick Start Image
Spork
⭐
84
Pig on Apache Spark
Phphiveadmin
⭐
81
An Apache Hive management system
Chukwa
⭐
78
Mirror of Apache Chukwa
Docker Spark
⭐
77
🚢 Docker image for Apache Spark
Implyr
⭐
73
SQL backend to dplyr for Impala
Waimak
⭐
73
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
The Apache Ignite Book
⭐
66
All code samples, scripts and more in-depth examples for The Apache Ignite Book. Includes Apache Ignite 2.6 or above
Apache Spark Hands On
⭐
64
Educational notes and hands-on problems with solutions for the Hadoop ecosystem
Hcatalog
⭐
61
Mirror of Apache HCatalog
Stormtweetssentimentanalysis
⭐
60
Computes sentiment analysis of tweets of US States in real-time using Storm.
Incubator Tez
⭐
60
Mirror of Apache Tez (Incubating)
Rhipe
⭐
54
R and Hadoop Integrated Programming Environment
Presto Yarn
⭐
52
Docker Hadoop
⭐
52
Docker image for main Apache Hadoop components (Yarn/Hdfs)
Cascading Flink
⭐
52
Cascading on Apache Flink®
Bigtop
⭐
51
Bigtop is a project for the development of packaging and tests of the Apache Hadoop ecosystem. The primary goal of Bigtop is to build a community around the packaging and interoperability testing of Hadoop-related projects. This includes testing at various levels (packaging, platform, runtime, upgrade, etc...) developed by a community with a focus on the system as a whole, rather than individual projects.
Lingual
⭐
48
Stand-alone ANSI SQL for Cascading on Apache Hadoop
Pydrill
⭐
46
Python Driver for Apache Drill.
Doris Website
⭐
45
Apache Doris Website
Hfsa
⭐
43
Hadoop FSImage Analyzer (HFSA)
Code Of Spark Big Data Business Trilogy
⭐
42
This is the code for the book "Spark Big Data Business Trilogy"
Yarn Prometheus Exporter
⭐
39
Export Hadoop YARN (resource-manager) metrics in prometheus format
Cdh Package
⭐
38
Vertica Hadoop Connector
⭐
38
Vertica Hadoop Connector
Hive Driver
⭐
38
Driver for connection to Apache Hive via Thrift API
Xxhadoop
⭐
37
Data analysis using Hadoop/Spark/Storm/ElasticSearch/MachineLearning etc. These are my daily notes, code, and demos. Don't fork, just star!
Test.fm
⭐
36
Testing framework for Collaborative Filtering
Avro Maven Plugin
⭐
34
Maven 2 Plugin for processing Apache Avro files. Avro is a subproject of Apache Hadoop.
Fluo Uno
⭐
34
Apache Fluo Uno
Ambari Metrics
⭐
34
Apache Ambari Metrics is a sub project of Apache Ambari.
Flume Logs
⭐
34
Apache Flume to process log files on Hadoop cluster
Nutch Newsclassify
⭐
33
A news classification system based on nutch
Ansible Ambari
⭐
33
Quickly deploy Hadoop with the help of Ansible and Apache Ambari
Recommender
⭐
33
NReco Recommender is a .NET port of Apache Mahout CF java engine (standalone, non-Hadoop version)
Related Searches
Java Apache (4,331)
Php Apache (2,291)
Java Hadoop (2,117)
Javascript Apache (1,450)
Python Apache (1,438)
Shell Apache (1,374)
Docker Apache (1,277)
Spark Hadoop (1,188)
Hadoop Hdfs (1,082)
Mysql Apache (865)
1-100 of 342 search results