Awesome Open Source
Search results for java hadoop
1,345 search results found
Spark
⭐
35,873
Apache Spark - A unified analytics engine for large-scale data processing
Apijson
⭐
15,348
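🏆 A zero-code, full-featured, highly secure ORM library 🚀 Zero code for backend APIs and docs; the frontend (client) customizes the data and structure of the returned JSON. 🏆 A JSON Transmission Protocol and an ORM Library 🚀 provides APIs and Docs without writing any code.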
Presto
⭐
14,746
The official home of the Presto distributed SQL query engine for big data
Bigdata Notes
⭐
13,291
A beginner's guide to big data ⭐️
Deeplearning4j
⭐
12,957
Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and onnx/pytorch, a modular and tiny c++ library for running math code and a java based math library on top of the core c++ library. Also includes samediff: a pytorch/tensorflow like library for running deep learning using automatic differentiation.
It_book
⭐
8,543
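This project collects over a thousand good, commonly used books that I have read or heard about over the years; the book you are looking for may well be here. It covers the internet...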
Doris
⭐
8,374
Apache Doris is an easy-to-use, high performance and unified analytics database.
God Of Bigdata
⭐
7,992
Focused on big data learning and interviews; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive.
Trino
⭐
7,886
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
H2o 3
⭐
6,294
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
Alluxio
⭐
6,254
Alluxio, data orchestration for analytics and machine learning in the cloud
Hive
⭐
4,833
Apache Hive
Ignite
⭐
4,464
Apache Ignite
Calcite
⭐
3,872
Apache Calcite
Nutch
⭐
2,588
Apache Nutch is an extensible and scalable web crawler
Dataspherestudio
⭐
2,557
DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
Bigdataguide
⭐
1,994
Big data learning: learn big data from scratch, including study videos for every stage of big data learning and interview materials
Ambari
⭐
1,915
Apache Ambari simplifies provisioning, managing, and monitoring of Apache Hadoop clusters.
Elasticsearch Hadoop
⭐
1,904
🐘 Elasticsearch real-time search and analytics natively integrated with Hadoop
Drill
⭐
1,814
Apache Drill is a distributed MPP query layer for self-describing data
Xlearning
⭐
1,729
AI on Hadoop
Gaffer
⭐
1,702
A large-scale entity and relation database supporting aggregation of properties
Easyreport
⭐
1,635
A simple and easy-to-use Web Report System for Java. EasyReport is a simple, easy-to-use web reporting tool (supports Hadoop, HBase, and various relational...
Flink Streaming Platform Web
⭐
1,562
A Flink-based web platform for real-time stream computing
Atlas
⭐
1,555
Apache Atlas
Mongo Hadoop
⭐
1,511
MongoDB Connector for Hadoop
Movie_recommend
⭐
1,441
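A Spark-based movie recommendation system, comprising a crawler project, a website, an admin back-office system, and the Spark recommendation system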
Cascalog
⭐
1,378
Data processing on Hadoop without the hassle.
Carbondata
⭐
1,364
High performance data store solution
Dr Elephant
⭐
1,301
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark
Javapdf
⭐
1,177
🍣 100 Java e-books: technical books in PDF (downloading and reading is an honor; merely liking and bookmarking is a shame)
Taier
⭐
1,114
Taier is a big data development platform for submission, scheduling, operation and maintenance, and indicator information display
Elephant Bird
⭐
1,100
Twitter's collection of LZO and Protocol Buffer-related Hadoop, Pig, Hive, and HBase code.
Kylo
⭐
1,035
Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licensed under Apache 2.0. Contributed by Teradata Inc.
Data Algorithms Book
⭐
973
MapReduce, Spark, Java, and Scala for Data Algorithms Book
Coding Now
⭐
925
Notes from my studies, along with eBooks I have read, video resources, and blogs I have collected over time that I consider good,
Livy
⭐
911
Livy is an open source REST interface for interacting with Apache Spark from anywhere
Addax
⭐
905
Addax is a versatile open-source ETL tool that can seamlessly transfer data between various RDBMS and NoSQL databases, making it an ideal solution for data migration.
Mr4c
⭐
890
Sqoop
⭐
820
Mirror of Apache Sqoop
Datasketches Java
⭐
819
Core Java Sketch Library.
Hadoop_study
⭐
817
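Regularly updated documentation for commonly used big data components in the Hadoop ecosystem. Focus areas, in order: Flink, Solr, Sparksql, ES, Scala, Kafka, Hbase/phoenix, Redis, Kerberos (the project includes Hadoop mind maps, Evernote notes, simple Scala demos, common utility classes, and desensitized train code; continuously updated!!!)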
Useractionanalyzeplatform
⭐
810
A big data platform for e-commerce user behavior analysis
Cdap
⭐
707
An open source framework for building data analytic applications.
Hive Json Serde
⭐
706
Read - Write JSON SerDe for Apache Hive.
Tony
⭐
689
TonY is a framework to natively run deep learning frameworks on Apache Hadoop.
Oozie
⭐
672
Mirror of Apache Oozie
Geometry Api Java
⭐
663
The Esri Geometry API for Java enables developers to write custom applications for analysis of spatial data. This API is used in the Esri GIS Tools for Hadoop and other 3rd-party data processing solutions.
Pig
⭐
656
Mirror of Apache Pig
Ozone
⭐
650
Scalable, redundant, and distributed object store for Apache Hadoop
Orc
⭐
606
Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
Giraph
⭐
582
Mirror of Apache Giraph
Hadoop2x Eclipse Plugin
⭐
549
Eclipse plugin for Hadoop 2.2.0, 2.4.1
Elephantdb
⭐
540
Distributed database specialized in exporting key/value data from Hadoop
Bigtop
⭐
500
Mirror of Apache Bigtop
Kafka Connect Hdfs
⭐
452
Kafka Connect HDFS connector
Aircompressor
⭐
452
A port of Snappy, LZO, LZ4, and Zstandard to Java
Marmaray
⭐
444
Generic Data Ingestion & Dispersal Library for Hadoop
Tuiblogs
⭐
443
Excellent computer programming blogs and articles. Share excellent blogs and sites
Indexr
⭐
422
An open-source columnar data format designed for fast, real-time analytics on big data.
Storm Yarn
⭐
419
Storm-yarn enables Storm clusters to be deployed into machines managed by Hadoop YARN.
Tez
⭐
419
Apache Tez
Iceberg
⭐
409
Iceberg is a table format for large, slow-moving tabular data
Sylph
⭐
396
Stream computing platform for bigdata
Oozie
⭐
378
Oozie - workflow engine for Hadoop
Kite
⭐
366
Kite SDK
Bigdata
⭐
358
💎🔥 Big data study notes
Venice
⭐
353
Venice, Derived Data Platform for Planet-Scale Workloads.
Apex Core
⭐
346
Mirror of Apache Apex core
Shopzz
⭐
344
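An e-commerce project developed with SpringCloud Alibaba. The mobile app is built with Flutter2.x, the mini program is built with uni-app, the admin side is built with 3.0 + Element Plus, and payment integrates digital currencies (Bitcoin, Ethereum UDST); the backend uses Hadoop, Flink, and other big...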
Spatial Framework For Hadoop
⭐
343
The Spatial Framework for Hadoop allows developers and data scientists to use the Hadoop data processing system for spatial data analysis.
Cloudbreak
⭐
338
CDP Public Cloud is an integrated analytics and data management platform deployed on cloud services. It offers broad data analytics and artificial intelligence functionality along with secure user access and data governance features.
Cascading
⭐
330
Cascading is a feature rich API for defining and executing complex and fault tolerant data processing flows locally or on a cluster.
Behemoth
⭐
284
Behemoth is an open source platform for large scale document analysis based on Apache Hadoop.
Gridgain Old
⭐
278
Hops
⭐
273
Hops Hadoop is a distribution of Apache Hadoop with distributed metadata.
Hadoop Connectors
⭐
267
Libraries and tools for interoperability between Hadoop-related open-source software and Google Cloud Platform.
Slimfast
⭐
265
Slimming down jars since 2016
Datacube
⭐
261
Multidimensional data storage with rollups for numerical data
Faunus
⭐
259
Graph Analytics Engine
Demo_11.11_storm Spark Hadoop
⭐
257
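An example experiment combining Hadoop, Storm, and Spark that simulates Taobao's Double 11 shopping festival and aggregates total sales from order details. Rough workflow: Stage 1 (real-time reports with Storm), Stage 2 (offline reports), Stage 3 (ad hoc and multidimensional queries over large-scale orders), Stage 4 (data mining and graph computation)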
Sparkstreaming
⭐
253
Real-time logs implemented with Spark Streaming + Flume + Kafka + HBase + Hadoop + Zookeeper
Facebook Hive Udfs
⭐
253
Facebook's Hive UDFs
Hadoop Mini Clusters
⭐
251
hadoop-mini-clusters provides an easy way to test Hadoop projects directly in your IDE
Hive Jdbc Uber Jar
⭐
243
Hive JDBC "uber" or "standalone" jar based on the latest Apache Hive version
Es Fastloader
⭐
242
Quickly build large-scale ElasticSearch indices by using the fault tolerance and parallelism of Hadoop
Shifu
⭐
235
An end-to-end machine learning and data mining framework on Hadoop
Big Whale
⭐
225
Scheduling of offline jobs and monitoring of real-time jobs for Spark, Flink, and others
Commoncrawl Crawler
⭐
208
The Common Crawl Crawler Engine and Related MapReduce code (2008-2012)
Emr Dynamodb Connector
⭐
204
Implementations of open source Apache Hadoop/Hive interfaces which allow for ingesting data from Amazon DynamoDB
Hadoop Pcap
⭐
202
Hadoop library to read packet capture (PCAP) files
Hadoop Book
⭐
198
Source code to accompany the book "Hadoop in Practice", published by Manning.
Calcite Avatica
⭐
195
Apache Calcite Avatica
Programming Video Tutorials
⭐
195
Video tutorials: Java, big data, cloud computing, Android, Hadoop, Docker, mysql, spark, CRM, OA..
Wonderdog
⭐
193
Bulk loading for elastic search
S3mper
⭐
192
s3mper - Consistent Listing for S3
Sparkrdma
⭐
191
RDMA accelerated, high-performance, scalable and efficient ShuffleManager plugin for Apache Spark
Wifiprobeanalysis
⭐
189
Commercial big data analysis technology based on WiFi probes
Javaorbigdata Interview
⭐
180
A compilation of interview knowledge points for Java developers and big data developers
Hadoop
⭐
178
Hadoop on Mesos
1-100 of 1,345 search results
Copyright 2018-2023 Awesome Open Source. All rights reserved.