Awesome Open Source
Search results for apache hadoop
193 search results found
Spark
⭐
37,661
Apache Spark - A unified analytics engine for large-scale data processing
Cookbook
⭐
12,557
The Data Engineering Cookbook
God Of Bigdata
⭐
8,483
Focused on big data study and interview preparation; the road to big data mastery starts here. Flink/Spark/Hadoop/HBase/Hive.
Hive
⭐
5,222
Apache Hive
Bigdl
⭐
4,728
Accelerate LLM with low-bit (FP4 / INT4 / FP8 / INT8) optimizations using bigdl-llm
Ignite
⭐
4,626
Apache Ignite
Calcite
⭐
4,216
Apache Calcite
Tensorflowonspark
⭐
3,851
TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.
Nutch
⭐
2,742
Apache Nutch is an extensible and scalable web crawler
Elasticsearch Hadoop
⭐
1,914
🐘 Elasticsearch real-time search and analytics natively integrated with Hadoop
Drill
⭐
1,856
Apache Drill is a distributed MPP query layer for self describing data
Atlas
⭐
1,685
Apache Atlas
Carbondata
⭐
1,401
High performance data store solution
Dr Elephant
⭐
1,301
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark
Hadoop Docker
⭐
1,169
Hadoop docker image
Impala
⭐
1,044
Apache Impala
Kylo
⭐
1,035
Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licensed under Apache 2.0. Contributed by Teradata Inc.
Awesome Hadoop
⭐
987
A curated list of amazingly awesome Hadoop and Hadoop ecosystem resources
Coding Now
⭐
925
Study notes, plus eBooks and video resources I have gone through, and blogs I have collected and found worthwhile, ...
Sqoop
⭐
820
Mirror of Apache Sqoop
Hawq
⭐
677
Apache HAWQ
Pig
⭐
659
Mirror of Apache Pig
Sparkr Pkg
⭐
649
R frontend for Spark
Orc
⭐
645
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
Giraph
⭐
582
Mirror of Apache Giraph
Spline
⭐
553
Data Lineage Tracking And Visualization Solution
Bigtop
⭐
549
Bigtop is an Apache Foundation project for Infrastructure Engineers and Data Scientists looking for comprehensive packaging, testing, and configuration of the leading open source big data components.
Bigdata Ecosystem
⭐
536
BigData Ecosystem Dataset
Tez
⭐
446
Apache Tez
Hadoopinternals
⭐
424
Diagrams describing Apache Hadoop internals (2.3.0 or later).
Eagle
⭐
410
Mirror of Apache Eagle
Graphx
⭐
353
Former GraphX development repository. GraphX has been merged into Apache Spark; please submit pull requests there.
Apex Core
⭐
346
Mirror of Apache Apex core
Easyhadoop
⭐
310
Apache Hadoop management system
Behemoth
⭐
284
Behemoth is an open source platform for large scale document analysis based on Apache Hadoop.
Sparkstreaming
⭐
253
Real-time logs implemented with Spark Streaming+Flume+Kafka+HBase+Hadoop+Zookeeper
Hive Jdbc Uber Jar
⭐
252
Hive JDBC "uber" or "standalone" jar based on the latest Apache Hive version
Kerberos_and_hadoop
⭐
248
Kerberos and Hadoop: The Madness beyond the Gate
Trafodion
⭐
243
Apache Trafodion
Node Hbase
⭐
232
Asynchronous HBase client for Node.js using REST
Calcite Avatica
⭐
225
Apache Calcite Avatica
Emr Dynamodb Connector
⭐
210
Implementations of open source Apache Hadoop/Hive interfaces which allow for ingesting data from Amazon DynamoDB
Sparkrdma
⭐
191
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Docker Flink
⭐
157
Apache Flink docker image
Bigdata Playground
⭐
154
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apache Storm, Twitter API, MongoDB, NodeJS, Angular, GraphQL
Logparser
⭐
153
Easy parsing of Apache HTTPD and NGINX access logs with Java, Hadoop, Hive, Flink, Beam, Storm, Drill, ...
Hdfs_fdw
⭐
131
PostgreSQL foreign data wrapper for HDFS
Parquet Rs
⭐
129
Apache Parquet implementation in Rust
Tajo
⭐
129
Mirror of Apache Tajo
Skein
⭐
126
A tool and library for easily deploying applications on Apache YARN
Docker Spark
⭐
118
Docker image for a general Apache Spark client
Bdutil
⭐
114
[DEPRECATED] Script used to manage Hadoop and Spark instances on Google Compute Engine
Gora
⭐
111
The Apache Gora open source framework provides an in-memory data model and persistence for big data.
Calcite Avatica Go
⭐
110
Mirror of Apache Calcite - Avatica Go SQL Driver
Datafu
⭐
110
Mirror of Apache DataFu
Linkedin Gradle Plugin For Apache Hadoop
⭐
106
Crux
⭐
101
Crux is a reporting application for HBase. Crux provides a simple web based graphical interface to access HBase, query data and create reports. Crux is open sourced under Apache Software Foundation License v2.0.
Spark With Python
⭐
98
Fundamentals of Spark with Python (using PySpark), code examples
Reef
⭐
92
Mirror of Apache REEF
Halyard
⭐
88
Halyard is an extremely horizontally scalable Triplestore with support for Named Graphs, designed for integration of extremely large Semantic Data Models, and for storage and SPARQL 1.1 querying of the whole Linked Data universe snapshots.
Docker Cloudera Quickstart
⭐
87
Docker Cloudera Quick Start Image
Spork
⭐
84
Pig on Apache Spark
Phphiveadmin
⭐
81
An Apache Hive management system
Chukwa
⭐
78
Mirror of Apache Chukwa
Docker Spark
⭐
77
🚢 Docker image for Apache Spark
Waimak
⭐
73
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Implyr
⭐
73
SQL backend to dplyr for Impala
The Apache Ignite Book
⭐
72
All code samples, scripts and more in-depth examples for The Apache Ignite Book. Include Apache Ignite 2.6 or above
Apache Spark Hands On
⭐
64
Educational notes and hands-on problems with solutions for the Hadoop ecosystem
Hcatalog
⭐
61
Mirror of Apache HCatalog
Incubator Tez
⭐
60
Mirror of Apache Tez (Incubating)
Stormtweetssentimentanalysis
⭐
60
Computes sentiment analysis of tweets of US States in real-time using Storm.
Rhipe
⭐
54
R and Hadoop Integrated Programming Environment
Presto Yarn
⭐
52
Cascading Flink
⭐
52
Cascading on Apache Flink®
Bigtop
⭐
51
Bigtop is a project for the development of packaging and tests of the Apache Hadoop ecosystem. The primary goal of Bigtop is to build a community around the packaging and interoperability testing of Hadoop-related projects. This includes testing at various levels (packaging, platform, runtime, upgrade, etc...) developed by a community with a focus on the system as a whole, rather than individual projects.
Doris Website
⭐
51
Apache Doris Website
Lingual
⭐
48
Stand-alone ANSI SQL for Cascading on Apache Hadoop
Pydrill
⭐
46
Python Driver for Apache Drill.
Hfsa
⭐
44
Hadoop FSImage Analyzer (HFSA)
Code Of Spark Big Data Business Trilogy
⭐
42
Code for the book "Spark Big Data Business Trilogy"
Yarn Prometheus Exporter
⭐
39
Export Hadoop YARN (resource-manager) metrics in Prometheus format
Cdh Package
⭐
38
Hive Jdbc Driver
⭐
38
An alternative to the "hive standalone" jar for connecting Java applications to Apache Hive via JDBC
Hive Driver
⭐
38
Driver for connection to Apache Hive via Thrift API
Vertica Hadoop Connector
⭐
38
Vertica Hadoop Connector
Xxhadoop
⭐
37
Data analysis using Hadoop/Spark/Storm/ElasticSearch/MachineLearning, etc. These are my daily notes, code, and demos. Don't fork, just star!
Test.fm
⭐
36
Testing framework for Collaborative Filtering
Flume Logs
⭐
34
Apache Flume to process log files on Hadoop cluster
Avro Maven Plugin
⭐
34
Maven 2 Plugin for processing Apache Avro files. Avro is a subproject of Apache Hadoop.
Ambari Metrics
⭐
34
Apache Ambari Metrics is a sub project of Apache Ambari.
Recommender
⭐
33
NReco Recommender is a .NET port of Apache Mahout CF java engine (standalone, non-Hadoop version)
Ansible Ambari
⭐
33
Quickly deploy Hadoop with the help of Ansible and Apache Ambari
Nutch Newsclassify
⭐
33
A news classification system based on Nutch
Docker Hadoop Ubuntu
⭐
32
A Hadoop image on Ubuntu
Jmxtrans Lib
⭐
32
JMXTrans configuration for Hadoop/Cassandra/ZooKeeper
Freebase2rdf
⭐
30
Hive Mr3
⭐
29
Hive for MR3
Netapp Hadoop Nfs Connector
⭐
29
This project provides an NFSv3 connector for Hadoop. Using the connector, Apache Hadoop and Apache Spark can use an NFSv3 server as their storage backend.
Bigdata Docker
⭐
26
Docker images for Open Source bigdata/hadoop projects
Related Searches
Java Apache (4,331)
Php Apache (2,627)
Java Hadoop (2,117)
Shell Apache (1,492)
Javascript Apache (1,450)
Python Apache (1,438)
Docker Apache (1,277)
Spark Hadoop (1,188)
Hadoop Hdfs (1,082)
Mysql Apache (961)
1-100 of 193 search results
Copyright 2018-2024 Awesome Open Source. All rights reserved.