Awesome Open Source
Search results for spark hive
225 search results found
Bigdata Notes
⭐
14,872
A beginner's guide to big data ⭐
Doris
⭐
11,243
Apache Doris is an easy-to-use, high performance and unified analytics database.
God Of Bigdata
⭐
8,483
Focused on big data learning and interviews; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive.
Sql Generator
⭐
3,346
🔨 Generate structured SQL statements from JSON. Built with Vue3 + TypeScript + Vite + Ant Design + MonacoEditor; a simple project (logic-heavy, UI-light) that is well suited for practice.
Linkis
⭐
3,283
Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications and the underlying data engines.
Dataspherestudio
⭐
2,860
DataSphereStudio is a one-stop data application development and management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Bigdataguide
⭐
2,355
Big data learning: learn big data from scratch, with study videos and interview materials for each stage of learning
Szt Bigdata
⭐
2,055
Shenzhen Metro big data passenger flow analysis system 🚇🚄🌟
Quicksql
⭐
1,939
A Flexible, Fast, Federated (3F) SQL Analysis Middleware for Multiple Data Sources
Kyuubi
⭐
1,849
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
Movie_recommend
⭐
1,441
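A Spark-based movie recommendation system, including a web crawler project, a website, an admin back-end system, and a Spark recommendation system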
Carbondata
⭐
1,401
High performance data store solution
Bigdata Growth
⭐
1,256
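A big data knowledge repository covering data warehouse modeling, real-time computing, big data, data middle platforms, system design, Java, algorithms, and more.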
Taier
⭐
1,220
Taier is a big data development platform for submission, scheduling, operation and maintenance, and indicator information display
Hadoop_study
⭐
817
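Regularly updated documentation for commonly used big data components in the Hadoop ecosystem. Focus areas, in order: Flink, Solr, Sparksql, ES, Scala, Kafka, Hbase/phoenix, Redis, Kerberos. (The project includes Hadoop mind maps, Evernote notes, simple Scala demos, common utility classes, and desensitized train code. Continuously updated!)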
Scriptis
⭐
767
Scriptis is for interactive data analysis with script development (SQL, Pyspark, HiveQL), task submission (Spark, Hive), UDF, function, resource management and intelligent diagnosis.
Nessie
⭐
762
Nessie: Transactional Catalog for Data Lakes with Git-like semantics
Coral
⭐
680
Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages.
Blinkdb
⭐
625
BlinkDB: Sub-Second Approximate Queries on Very Large Data.
Wedatasphere
⭐
624
WeDataSphere is a financial grade, one-stop big data platform suite.
Yanagishima
⭐
584
Web UI for Trino, Hive and SparkSQL
Hivemall
⭐
508
Scalable machine learning library for Apache Hive/Spark/Pig
Moonbox
⭐
487
Moonbox is a DVtaaS (Data Virtualization as a Service) Platform
Connectors
⭐
383
This library allows Scala and Java-based projects (including Apache Flink, Apache Hive, Apache Beam, and PrestoDB) to read from and write to Delta Lake.
Zdh_web
⭐
379
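A big data collection and extraction platform. zdh_web is the visual management platform for the zdh family of services, covering data collection, scheduling, permissions, and approval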
Big_data_architect_skills
⭐
353
Skills a big data architect should master
Transport
⭐
288
A framework for writing performant user-defined functions (UDFs) that are portable across a variety of engines including Apache Spark, Apache Hive, and Presto.
Hadoop Tutorials Examples
⭐
228
Source, data and tutorials of the blog post video series of Hue, the Web UI for Hadoop.
Bigdata_docker
⭐
226
Big Data Ecosystem Docker
Hadoop Docker
⭐
210
A Docker-based Hadoop development and test environment, including Hadoop, Hive, HBase, and Spark
Xsql
⭐
207
Unified SQL Analytics Engine Based on SparkSQL
Datacompare
⭐
195
Big data comparison and data profiling platform: low code, data comparison, and data profiling
Big Data
⭐
190
An open-source, systematic big data learning tutorial. Spark learning; hadoop, hive, hbase, and flink tutorials; linux, from beginner to expert
Bigdata Hub
⭐
187
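A knowledge system for data construction and big data technology, including mainstream frameworks such as hadoop, hive, spark, and flink, and related framework families,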
Aws Glue Data Catalog Client For Apache Hive Metastore
⭐
184
The AWS Glue Data Catalog is a fully managed, Apache Hive Metastore compatible, metadata repository. Customers can use the Data Catalog as a central repository to store structural and operational metadata for their data. AWS Glue provides out-of-box integration with Amazon EMR that enables customers to use the AWS Glue Data Catalog as an external Hive Metastore. This is an open-source implementation of the Apache Hive Metastore client on Amazon EMR clusters that uses the AWS Glue Data Catalog
Dpkb
⭐
182
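A collection of big data content, including distributed storage engines, distributed compute engines, data warehouse construction, and more. Keywords: Hadoop, HBase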
Spark 2.3.1
⭐
174
Spark-2.3.1 source code walkthrough
Juicy Bigdata
⭐
162
🎉🎉🐳 Datawhale Introduction to Big Data Processing tutorial | the opening course of the big data technology track 🎉🎉
Spark Authorizer
⭐
158
A Spark SQL extension which provides SQL Standard Authorization for Apache Spark
Huaweicloud Mrs Example
⭐
150
Examples for HUAWEI CLOUD MRS.
Bigdata Learning
⭐
136
Big data study log
Hdfs_fdw
⭐
131
PostgreSQL foreign data wrapper for HDFS
Xichuan_note
⭐
114
xichuan's study summary notes, covering java, spring, other commonly used java frameworks, and big data components
Spark Atlas Connector
⭐
112
A Spark Atlas connector to track data lineage in Apache Atlas
Distributed Statistical Computing
⭐
99
Teaching Materials for Distributed Statistical Computing (teaching materials for big data distributed computing)
Jaws Spark Sql Rest
⭐
92
Smart Data Lake
⭐
87
Smart Automation Tool for building modern Data Lakes and Data Pipelines
Flowman
⭐
85
Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pipelines.
Spark Llap
⭐
82
Hadoop_cookbook
⭐
81
Cookbook to install Hadoop 2.0+ using Chef
Spark Acid
⭐
79
ACID Data Source for Apache Spark based on Hive ACID
Bigdata Learning Notes
⭐
79
Luigi Warehouse
⭐
73
A luigi powered analytics / warehouse stack
The Apache Ignite Book
⭐
72
All code samples, scripts and more in-depth examples for The Apache Ignite Book. Includes Apache Ignite 2.6 or above
Vagrant Hadoop Spark Hive
⭐
68
Vagrant project to spin up a single virtual machine running current versions of Hadoop, Hive and Spark
Ambari Zeppelin Service
⭐
68
Ambari service for Apache Zeppelin notebook
Terraform Aws Emr Cluster
⭐
67
Terraform module to provision an Elastic MapReduce (EMR) cluster on AWS
Apache Spark Hands On
⭐
64
Educational notes and hands-on problems with solutions for the Hadoop ecosystem
Spark Gpu
⭐
61
Spark GPU and SIMD Support
Hive Metastore Docker
⭐
61
Example for article Running Spark 3 with standalone Hive Metastore 3.0
Iceberg
⭐
59
A temporary home for LinkedIn's changes to Apache Iceberg (incubating)
Apachespark
⭐
59
This repository will help you learn Databricks concepts through examples. It covers the important topics a data engineer needs in real-world work, using pyspark and sparksql for development. The course ends with a few case studies.
Mylearningnotes
⭐
58
Because it's never too late to start taking notes and make them public...
Titandataoperationsystem
⭐
57
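The best big data project. "Titan Data Operation System": this project is a full-stack, closed-loop system; we have web tools such as Echart for data visualization;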
Pybigdata
⭐
56
Working with various big data components using python
Bigdataparty
⭐
54
An All-in-One Dockerfile for big data components
Vagrant Hadoop Hive Spark
⭐
53
Vagrant project to spin up a single node VM running current versions of Hadoop, Hive and Spark
Bestconf
⭐
53
A tool automatically improving the performance of large-scale systems by finding better configuration settings
Spark Training
⭐
52
Repository used for Spark Trainings
Movie Recommender Demo
⭐
50
This project walks through how you can create recommendations using Apache Spark machine learning. There are a number of jupyter notebooks that you can run on IBM Data Science Experience, and there is a live demo of a movie recommendation web application you can interact with. The demo also uses IBM Message Hub (kafka) to push application events to a topic, where they are consumed by a spark streaming job running on IBM BigInsights (hadoop).
Spark Hive Udf
⭐
47
Example project showing how to use Hive UDFs in Apache Spark
Itachi
⭐
46
A library that brings useful functions from various modern database management systems to Apache Spark
Hadoop Spark Hive Cluster Docker
⭐
45
hadoop-spark-hive-cluster-docker
Docker Hadoop Workbench
⭐
44
A Hadoop cluster based on Docker, including Hive and Spark.
Learnbasicbigdatatech
⭐
44
🚀Some projects on Big Data Analysis like Spark, Hive, Presto and Data Visualization like Superset
Xgbspark Text Classification
⭐
43
XGBoost on Spark for Chinese Text Classification
Smv
⭐
41
Spark Modularized View
Ppextensions
⭐
39
Set of iPython and Jupyter extensions to improve user experience
Awesome Druid
⭐
38
Swordfish
⭐
37
Open-source distributed workflow scheduling tool that also supports streaming tasks.
Xxhadoop
⭐
37
Data Analysis Using Hadoop/Spark/Storm/ElasticSearch/MachineLearning etc. This is My Daily Notes/Code/Demo. Don't fork, Just star !
Sharpetl
⭐
36
Write ETL using your favorite SQL dialects
Hadoop Guide
⭐
36
🐘 Study notes on big data frameworks such as HDFS, Yarn, MapReduce, HBase, Hive, Pig, Sqoop, Flume, Zoo
Opendataplatform
⭐
34
An open source, enterprise-scale, vendor-neutral data platform accelerating solution delivery.
Bigdata Docker Compose
⭐
33
Hadoop, Hive, Spark, Zeppelin and Livy: all in one Docker-compose file.
Engineeringteam
⭐
32
A place for organizing the YBIGTA engineering team's materials.
Building Data Lakehouse
⭐
32
Creation of a data lakehouse and an ELT pipeline to enable the efficient analysis and use of data
Framework Of Bigdata
⭐
30
Big data interview questions: the road from 0 to 1 toward becoming an architect. Flink, Spark, Hive, HBase, Hadoop, K
Ireport
⭐
29
Data analysis and statistical reporting platform
Squerall
⭐
27
An implementation of the so-called Semantic Data Lake, using Apache Spark and Presto.
Nyyellowtaxiproject
⭐
27
Big Data project using Hadoop (MapReduce, spark, Hive)
Spark Hive Streaming Sink
⭐
26
A sink to save Spark Structured Streaming DataFrame into Hive table
Bigdatasalaryanaliysystem
⭐
26
Big data job posting analysis platform
Bigdata Doc
⭐
25
A collection of big data study notes, learning roadmaps, and technical case studies.
Daflow
⭐
24
Apache-Spark based Data Flow(ETL) Framework which supports multiple read, write destinations of different types and also support multiple categories of transformation rules.
Bigdata Tutorial
⭐
22
Mysql Time Machine
⭐
22
mysql-time-machine project
Prettier Sql
⭐
22
[ARCHIVED] Please use https://github.com/sql-formatter-org/sql-formatter
Spark_log_data
⭐
21
Flume-to-Spark-Streaming Log Parser
Biginsights On Apache Hadoop
⭐
21
Example projects for 'BigInsights for Apache Hadoop' on IBM Bluemix
Related Searches
Scala Spark (3,279)
Python Spark (2,053)
Java Spark (1,587)
Apache Spark (1,207)
Spark Hadoop (1,188)
Jupyter Notebook Spark (1,151)
Spark Kafka (985)
Spark Streaming (817)
Spark Pyspark (812)
Shell Spark (705)
1-100 of 225 search results
Copyright 2018-2024 Awesome Open Source. All rights reserved.