Awesome Open Source
Search results for hadoop hive
716 search results found
Apijson
⭐
15,902
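🏆 A zero-code, full-featured, highly secure ORM library 🚀 Zero code for backend APIs and docs; the frontend (client) customizes the data and structure of the returned JSON. 🏆 A JSON Transmission Protocol and an ORM Library 🚀 provides APIs and Docs without writing any code.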
Presto
⭐
15,102
The official home of the Presto distributed SQL query engine for big data
Bigdata Notes
⭐
14,410
A beginner's guide to big data ⭐️
Doris
⭐
9,624
Apache Doris is an easy-to-use, high performance and unified analytics database.
Trino
⭐
8,571
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
God Of Bigdata
⭐
8,483
Focused on big data learning and interviews; start your journey to big data mastery. Flink/Spark/Hadoop/Hbase/Hive.
Hive
⭐
5,095
Apache Hive
Dataspherestudio
⭐
2,746
DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Bigdataguide
⭐
2,257
Big data learning: learn big data from scratch, including study videos and interview materials for each stage of learning
Drill
⭐
1,837
Apache Drill is a distributed MPP query layer for self-describing data
Kyuubi
⭐
1,723
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
Szt Bigdata
⭐
1,702
Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟
Mongo Hadoop
⭐
1,511
MongoDB Connector for Hadoop
Movie_recommend
⭐
1,441
A Spark-based movie recommendation system, including a crawler project, a website, an admin back end, and a Spark recommendation system
Carbondata
⭐
1,386
High performance data store solution
Taier
⭐
1,189
Taier is a big data development platform for submission, scheduling, operation and maintenance, and indicator information display
Elephant Bird
⭐
1,100
Twitter's collection of LZO and Protocol Buffer-related Hadoop, Pig, Hive, and HBase code.
Bigdata Growth
⭐
1,052
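A big data knowledge repository covering data warehouse modeling, real-time computing, big data, data middle platforms, system design, Java, algorithms, and more.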
Awesome Hadoop
⭐
987
A curated list of amazingly awesome Hadoop and Hadoop ecosystem resources
Addax
⭐
973
Addax is a versatile open-source ETL tool that can seamlessly transfer data between various RDBMS and NoSQL databases, making it an ideal solution for data migration.
Docker Hive
⭐
918
Hadoop_study
⭐
817
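Regularly updated documentation for commonly used big data components in the Hadoop ecosystem. Focus, in order: Flink, Solr, Sparksql, ES, Scala, Kafka, Hbase/phoenix, Redis, Kerberos. (The project includes Hadoop mind maps, Evernote notes, simple Scala demos, common utility classes, and desensitized train code. Continuously updated!)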
Hive Json Serde
⭐
706
Read - Write JSON SerDe for Apache Hive.
Wedatasphere
⭐
593
WeDataSphere is a financial grade, one-stop big data platform suite.
Data Engineering Interview Questions
⭐
554
More than 2000 data engineer interview questions.
Gis Tools For Hadoop
⭐
495
The GIS Tools for Hadoop are a collection of GIS tools for spatial analysis of big data.
Hadoop Ansible
⭐
416
Ansible playbook that installs a Hadoop cluster, with HBase, Hive, Presto for analytics, and Ganglia, Smokeping, Fluentd, Elasticsearch and Kibana for monitoring and centralized log indexing.
Bigdata
⭐
358
💎🔥 Big data study notes
Big_data_architect_skills
⭐
353
Skills a big data architect should master
Trendingtopics
⭐
351
Rails app for tracking trends in server logs - powered by the Cloudera Hadoop Distribution on EC2
Spatial Framework For Hadoop
⭐
343
The Spatial Framework for Hadoop allows developers and data scientists to use the Hadoop data processing system for spatial data analysis.
Facebook Hive Udfs
⭐
255
Facebook's Hive UDFs
Hive Jdbc Uber Jar
⭐
248
Hive JDBC "uber" or "standalone" jar based on the latest Apache Hive version
Hadoop Tutorials Examples
⭐
228
Source, data and tutorials of the blog post video series of Hue, the Web UI for Hadoop.
Bigdata_docker
⭐
226
Big Data Ecosystem Docker
Hadoop Docker
⭐
210
A Docker-based Hadoop development and test environment, including Hadoop, Hive, HBase, and Spark
Emr Dynamodb Connector
⭐
204
Implementations of open source Apache Hadoop/Hive interfaces which allow for ingesting data from Amazon DynamoDB
Hadoop Pcap
⭐
202
Hadoop library to read packet capture (PCAP) files
Haproxy Configs
⭐
198
80+ HAProxy Configs for Hadoop, Big Data, NoSQL, Docker, Kubernetes, Elasticsearch, SolrCloud, HBase, MySQL, PostgreSQL, Apache Drill, Hive, Presto, Impala, Hue, ZooKeeper, SSH, RabbitMQ, Redis, Riak, Cloudera, OpenTSDB, InfluxDB, Prometheus, Kibana, Graphite, Rancher etc.
Crunch
⭐
196
A fast to develop, fast to run, Go based toolkit for ETL and feature extraction on Hadoop.
Big Data
⭐
190
An open-source, systematic big data learning tutorial. Spark, Hadoop, Hive, HBase, and Flink tutorials; Linux from beginner to expert
Dpkb
⭐
171
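A summary of big data topics, including distributed storage engines, distributed compute engines, data warehouse construction, and more. Keywords: Hadoop, HBase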
Juicy Bigdata
⭐
162
🎉🎉🐳 Datawhale Introduction to Big Data Processing tutorial | The opening course for the big data technology track 🎉🎉
Logparser
⭐
148
Easy parsing of Apache HTTPD and NGINX access logs with Java, Hadoop, Hive, Flink, Beam, Storm, Drill, ...
Eel Sdk
⭐
140
Big Data Toolkit for the JVM
Bigdata Hub
⭐
138
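A knowledge system for data construction and big data technology, covering mainstream frameworks such as hadoop, hive, spark, and flink and related frameworks,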
Bigdata Learning
⭐
136
Big data learning log
Hdfs_fdw
⭐
131
PostgreSQL foreign data wrapper for HDFS
Hadoopdemo
⭐
128
Simple Hadoop application examples, including MapReduce, word count, basic HDFS operations, web log analysis, Zoo
Rhive
⭐
124
RHive is an R extension facilitating distributed computing via Apache Hive.
Kylin Docker
⭐
116
This repository tracks the code and files for building a Docker image with Apache Kylin.
Xichuan_note
⭐
114
xichuan's study summary notes, covering java, spring, other common java frameworks, and big data components
Avro Hadoop Starter
⭐
111
Example MapReduce jobs in Java, Hive, Pig, and Hadoop Streaming that work on Avro data.
Distributed Statistical Computing
⭐
100
Teaching Materials for Distributed Statistical Computing (teaching materials for big data distributed computing)
Streamx
⭐
95
kafka-connect-s3 : Ingest data from Kafka to Object Stores(s3)
Wifi
⭐
95
A big data query and analysis system based on information captured over wifi
Schedoscope
⭐
95
Schedoscope is a scheduling framework for painfree agile development, testing, (re)loading, and monitoring of your datahub, lake, or whatever you choose to call your Hadoop data warehouse these days.
My Tutorial
⭐
93
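I want to build my own knowledge system. My job is in big data, so the focus is mainly on big data, starting from the mainstream frameworks Hadoop, S Big data development is tedious, and a correct runtime environment is the first step to success, so I try to work through the whole process of setup, deployment, and development, single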
Flink Sql Benchmark
⭐
92
Hiho
⭐
84
Hadoop Data Integration with various databases, ftp servers, salesforce. Incremental update, dedup, append, merge your data on Hadoop.
Smart Data Lake
⭐
83
Smart Automation Tool for building modern Data Lakes and Data Pipelines
Devops Perl Tools
⭐
82
25+ DevOps CLI Tools - Anonymizer, SQL ReCaser (MySQL, PostgreSQL, AWS Redshift, Snowflake, Apache Drill, Hive, Impala, Cassandra CQL, Microsoft SQL Server, Oracle, Couchbase N1QL, Dockerfiles), Hadoop HDFS & Hive tools, Solr/SolrCloud CLI, Nginx stats & HTTP(S) URL watchers for load-balanced web farms, Linux tools etc.
Flowman
⭐
81
Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pipelines.
Phphiveadmin
⭐
81
An Apache Hive management system
Hadoop_cookbook
⭐
80
Cookbook to install Hadoop 2.0+ using Chef
Howl
⭐
77
Common metadata layer for Hadoop's Map Reduce, Pig, and Hive
Hive_ql_parser
⭐
73
Searchanalytics Bigdata
⭐
71
Customer Product search clicks analytics using big data Hadoop, Hive, Oozie, ElasticSearch, Akka, Spring Data
Pxf
⭐
71
Platform Extension Framework: Federated Query Engine
Hive_test
⭐
68
Unit test framework for hive and hive-service
Vagrant Hadoop Spark Hive
⭐
68
Vagrant project to spin up a single virtual machine running current versions of Hadoop, Hive and Spark
Datamingproject
⭐
67
大数据平台相关代码(ES/Hive/Hadoop/hdfs/hbase)
Terraform Aws Emr Cluster
⭐
67
Terraform module to provision an Elastic MapReduce (EMR) cluster on AWS
Structor
⭐
67
Vagrant files creating multi-node virtual Hadoop clusters with or without security.
The Apache Ignite Book
⭐
66
All code samples, scripts and more in-depth examples for The Apache Ignite Book. Includes Apache Ignite 2.6 or above
Sqlwindowing
⭐
65
SQL Windowing Functions for Hadoop
Apache Spark Hands On
⭐
64
Educational notes, hands-on problems w/ solutions for hadoop ecosystem
Hive Io Experimental
⭐
62
Hive I/O Library
Spark Gpu
⭐
61
Spark GPU and SIMD Support
Hive Funnel Udf
⭐
61
Hive UDFs for funnel analysis
Apachespark
⭐
59
This repository will help you learn databricks concepts with the help of examples. It will include all the important topics that we need in our real-life experience as data engineers. We will be using pyspark & sparksql for the development. At the end of the course we also cover a few case studies.
Mylearningnotes
⭐
58
Because it's never too late to start taking notes and 'public' it...
Essentials
⭐
57
Titandataoperationsystem
⭐
57
The best big data project. Titan Data Operation System: this project is a full-stack, closed-loop system; we have web, Echart, etc. for data visualization;
Pybigdata
⭐
56
Using python to work with various big data components
Bigdataparty
⭐
54
An All-in-One Dockerfile for big data components
Vagrant Hadoop Hive Spark
⭐
53
Vagrant project to spin up a single node VM running current versions of Hadoop, Hive and Spark
Bestconf
⭐
53
A tool automatically improving the performance of large-scale systems by finding better configuration settings
Spark Training
⭐
52
Repository used for Spark Trainings
Redshift Benchmark
⭐
52
Til
⭐
51
Today I Learned
Qds Sdk Py
⭐
51
Python SDK for accessing Qubole Data Service
Clickstream Tutorial
⭐
51
Code for Tutorial on designing clickstream analytics application using Hadoop
Movie Recommender Demo
⭐
50
This project walks through how you can create recommendations using Apache Spark machine learning. There are a number of jupyter notebooks that you can run on IBM Data Science Experience, and there is a live demo of a movie recommendation web application you can interact with. The demo also uses IBM Message Hub (kafka) to push application events to a topic where they are consumed by a spark streaming job running on IBM BigInsights (hadoop).
Dplyr Spark
⭐
49
spark backend for dplyr
Hadoop Unit
⭐
45
Hadoop-Unit is a project which allows testing projects that need hadoop ecosystem components like kafka, solr, hdfs, hive, hbase, ...
Cloudera Cookbook
⭐
45
Cloudera (Hadoop + Hive) chef cookbook
Hadoop Spark Hive Cluster Docker
⭐
45
hadoop-spark-hive-cluster-docker
Flume Kafka Storm
⭐
45
A basic framework for real-time big data computing
Doris Website
⭐
45
Apache Doris Website
Related Searches
Java Hadoop (2,117)
Spark Hadoop (1,188)
Hadoop Hdfs (1,082)
Hadoop Mapreduce (851)
Shell Hadoop (766)
Python Hadoop (761)
Java Hive (706)
Spark Hive (529)
Apache Hadoop (514)
Scala Hadoop (479)
1-100 of 716 search results
Copyright 2018-2023 Awesome Open Source. All rights reserved.