Awesome Open Source
Search results for hadoop mapreduce
838 search results found
Data Science Ipython Notebooks
⭐
25,025
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Bigdata Notes
⭐
13,291
A beginner's guide to big data ⭐️
Cookbook
⭐
11,769
The Data Engineering Cookbook
Hive
⭐
4,840
Apache Hive
Scalding
⭐
3,433
A Scala API for Cascading
Mrjob
⭐
2,584
Run MapReduce jobs on Hadoop or Amazon Web Services
Poseidon
⭐
1,543
A search engine which can hold 100 trillion lines of log data.
Mongo Hadoop
⭐
1,511
MongoDB Connector for Hadoop
Bigdata Interview
⭐
1,397
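🎯 🌟 [Big data interview questions] Big data interview questions collected from around the web, along with the author's own summarized answers. Currently includes Hadoop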
Data Algorithms Book
⭐
973
MapReduce, Spark, Java, and Scala for Data Algorithms Book
Bigdata Growth
⭐
907
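A big data knowledge repository covering data warehouse modeling, real-time computing, big data, data middle platforms, system design, Java, algorithms, and more.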
Cdap
⭐
707
An open source framework for building data analytic applications.
Elephantdb
⭐
540
Distributed database specialized in exporting key/value data from Hadoop
Bigdata Ecosystem
⭐
536
BigData Ecosystem Dataset
Scoobi
⭐
485
A Scala productivity framework for Hadoop.
Bigdata
⭐
358
💎🔥 Big data study notes
Cascading
⭐
330
Cascading is a feature rich API for defining and executing complex and fault tolerant data processing flows locally or on a cluster.
Behemoth
⭐
284
Behemoth is an open source platform for large scale document analysis based on Apache Hadoop.
Hadoop Connectors
⭐
267
Libraries and tools for interoperability between Hadoop-related open-source software and Google Cloud Platform.
Parkour
⭐
261
Hadoop MapReduce in idiomatic Clojure.
Hadoopy
⭐
244
Python MapReduce library written in Cython. Visit us in #hadoopy on freenode. See the link below for documentation and tutorials.
Hadoop Docker
⭐
210
A Docker-based Hadoop development and test environment, including Hadoop, Hive, HBase, and Spark
Commoncrawl Crawler
⭐
208
The Common Crawl Crawler Engine and Related MapReduce code (2008-2012)
Hadoop Pcap
⭐
202
Hadoop library to read packet capture (PCAP) files
Wonderdog
⭐
193
Bulk loading for Elasticsearch
Terrapin
⭐
168
Serving system for batch generated data sets
Juicy Bigdata
⭐
162
🎉🎉🐳 Datawhale's Introduction to Big Data Processing tutorial | An opening course for the big data technology track 🎉🎉
Cc Mrjob
⭐
157
Demonstration of using Python to process the Common Crawl dataset with the mrjob framework
Spatialhadoop2
⭐
148
The second generation of SpatialHadoop that ships as an extension
Learning Hadoop And Spark
⭐
148
Companion to Learning Hadoop and Learning Spark courses on LinkedIn Learning
Hadoop R
⭐
135
Example code for running R on Hadoop
Hadoopdemo
⭐
128
Simple Hadoop application examples, including MapReduce, word count, basic HDFS operations, web log analysis, Zoo
Hipi
⭐
128
HIPI: Hadoop Image Processing Interface
Sequenceiq Samples
⭐
119
SequenceIQ Hadoop examples
Hackathon
⭐
114
Library and resources for hack/reduce Hackathon events
Asakusafw
⭐
113
Asakusa Framework
Avro Hadoop Starter
⭐
111
Example MapReduce jobs in Java, Hive, Pig, and Hadoop Streaming that work on Avro data.
Gora
⭐
110
The Apache Gora open source framework provides an in-memory data model and persistence for big data.
Hadron
⭐
110
Construct and run Hadoop MapReduce programs in Haskell
Dynamometer
⭐
110
A tool for scale and performance testing of HDFS with a specific focus on the NameNode.
Introtohadoopandmr__udacity_course
⭐
103
🐘 Source code for assignments of Udacity course "Introduction to Hadoop and MapReduce"
Distributed Statistical Computing
⭐
98
Teaching Materials for Distributed Statistical Computing (teaching materials for distributed big data computing)
Spark With Python
⭐
98
Fundamentals of Spark with Python (using PySpark), code examples
Focusbigdata
⭐
89
[The road to big data mastery: learning path + interview experiences + résumés]
Annotated Wikiextractor
⭐
88
Simple Wikipedia plain text extractor with article link annotations and Hadoop support.
Elastic Mapreduce Ruby
⭐
86
Amazon's Elastic MapReduce Ruby client. Ruby 1.9.X compatible.
Lemur
⭐
85
Lemur is a tool to launch Hadoop jobs locally or on EMR, based on a configuration file, referred to as a jobdef. The jobdef file describes your EMR cluster, local environment, pre- and post-actions and zero or more "steps".
Solutions Google Compute Engine Cluster For Hadoop
⭐
81
This sample app will get you up and running quickly with a Hadoop cluster on Google Compute Engine. For more information on running Hadoop on GCE, read the papers at https://cloud.google.com/resources/.
Programmingwithscalding
⭐
81
Programming MapReduce with Scalding
Chukwa
⭐
78
Mirror of Apache Chukwa
Hbase Orm
⭐
72
A production-grade HBase ORM library that makes accessing HBase clean, fast and fun (Can also be used as Bigtable ORM)
Guagua
⭐
72
An iterative computing framework for both Hadoop MapReduce and Hadoop YARN.
Hadoop Map Reduce Patterns
⭐
71
Hadoop Map-Reduce Design Patterns
Scala Hadoop
⭐
70
Using Hadoop with Scala
Hadoop Java Example
⭐
66
A very simple example of using Hadoop's MapReduce functionality in Java.
Hadoop Bam
⭐
66
Hadoop-BAM is a Java library for the manipulation of files in common bioinformatics formats using the Hadoop MapReduce framework
Src
⭐
62
A light-weight distributed stream computing framework for Golang
Hive Io Experimental
⭐
62
Hive I/O Library
Pybigdata
⭐
56
Working with various big data components using Python
Clickhouse Hdfs Loader
⭐
54
Loading HDFS data into ClickHouse
Snabler
⭐
54
Parallel Algorithms in Python for Hadoop/Mapreduce
Mlhadoop
⭐
53
This repository contains machine-learning MapReduce code for Hadoop, written from scratch (without using any package or library), e.g. prediction (linear and logistic regression), clustering (K-Means), classification (KNN), etc.
Mapreduce Demo
⭐
53
Hands-on practice examples for learning Hadoop and MapReduce programming
Hadoop Solr
⭐
51
Code to index HDFS to Solr using MapReduce
Hadoop_vision
⭐
49
Example code for "Web-Scale Computer Vision using MapReduce for Multimedia Data Mining"
Hadoop Papyrus
⭐
48
A Ruby DSL framework for Hadoop MapReduce. Renamed from hadoop-rubydsl.
Hadoop Sstable
⭐
47
Splittable Input Format for Reading Cassandra SSTables Directly
Hbasedoc_cn
⭐
46
Chinese translation of the HBase 0.95 documentation
Hadoop
⭐
46
A Hanborq-optimized Hadoop distribution, with a particular focus on high MapReduce performance. It's the core part of HDH (Hanborq Distribution with Hadoop for Big Data Engineering).
Simr
⭐
45
Spark In MapReduce (SIMR) - launching Spark applications on existing Hadoop MapReduce infrastructure
Big_data
⭐
45
A collection of tutorials on Hadoop, MapReduce, Spark, Docker
Locis
⭐
44
Implementation of "A Parallel Spatial Co-location Mining Algorithm Based on MapReduce" paper
Machine_learning_in_action_py3
⭐
44
An important book about machine learning algorithms that introduces applications of these algorithms and tools and shows how to use them in a real environment. Where other books are long on machine learning theory, this one focuses more on how to implement machine learning algorithms in code.
Code Of Spark Big Data Business Trilogy
⭐
42
Code for the book "Spark Big Data Business Trilogy"
P3
⭐
42
An open source pcap packet and NetFlow file analysis tool using Hadoop MapReduce and Hive.
Pallet Hadoop
⭐
41
Hadoop Cluster Management with Intelligent Defaults
Barclamp Pig
⭐
41
[UNMAINTAINED] Hadoop Pig: Mapreduce Programming component
Devops
⭐
40
DevOps
Sizzle
⭐
39
A compiler and runtime for Google's Sawzall language, optimized for Hadoop
Gomrjob
⭐
39
gomrjob - a Go Framework for Hadoop Map Reduce Jobs
Unoexample
⭐
38
MapReduce/Hadoop example that uses regular playing cards to show mapping and reducing.
Csds Material
⭐
38
Course material for the Computer Systems for Data Science class at Columbia
Hadoop Guide
⭐
36
🐘 Study notes on big data frameworks such as HDFS, Yarn, MapReduce, HBase, Hive, Pig, Sqoop, Flume, Zoo
Hadoop_exporter
⭐
35
A Hadoop exporter for Prometheus that scrapes Hadoop metrics (including HDFS, YARN, MapReduce, HBase, etc.) from the JMX URLs of Hadoop components.
Replephant
⭐
35
A Clojure library to interactively analyze Hadoop cluster usage via REPL
Kassandramrhelper
⭐
35
Library for processing Cassandra SSTables with Hadoop MapReduce.
Intellij Hadoop
⭐
35
Run Hadoop programs using IntelliJ
Cc Warc Examples
⭐
35
CommonCrawl WARC/WET/WAT examples and processing code for Java + Hadoop
Marklogic Contentpump
⭐
34
MarkLogic Contentpump (mlcp)
Haskell_hadoop
⭐
34
Haskell module for streaming hadoop MapReduce jobs
Data Infra Projects
⭐
34
List of some interesting projects
Avro2parquet
⭐
33
Hadoop MapReduce tool to convert Avro data files to Parquet format.
Cc Helloworld
⭐
33
CommonCrawl Hello World example
Efflux
⭐
31
Easy Hadoop Streaming and MapReduce interfaces in Rust
Webarchive Indexing
⭐
30
Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system.
Bigfatlm
⭐
30
Hadoop MapReduce training of modified Kneser-Ney smoothed language models
Nativetask
⭐
29
Hadoop task level native runtime
Mongoreduce
⭐
29
Hadoop Input and Output formats for MongoDB
Emr S3 Io
⭐
29
Hadoop IO for Amazon S3
Mongo Deep Mapreduce
⭐
28
Use Hadoop MapReduce directly on Mongo data
Related Searches
Java Hadoop (2,117)
Spark Hadoop (1,188)
Hadoop Hdfs (1,082)
Shell Hadoop (766)
Python Hadoop (761)
Java Mapreduce (759)
Hadoop Hive (703)
Apache Hadoop (514)
Scala Hadoop (479)
Hadoop Hbase (470)
1-100 of 838 search results
Copyright 2018-2023 Awesome Open Source. All rights reserved.