Awesome Open Source
Search results for python mapreduce
204 search results found
Data Science Ipython Notebooks
⭐
25,025
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
Dev Setup
⭐
5,802
macOS development environment setup: Easy-to-understand instructions with automated setup scripts for developer tools like Vim, Sublime Text, Bash, iTerm, Python data analysis, Spark, Hadoop MapReduce, AWS, Heroku, JavaScript web development, Android development, common data stores, and dev-based OS X defaults.
Dpark
⭐
2,637
Python clone of Spark, a MapReduce-like framework in Python
Mrjob
⭐
2,584
Run MapReduce jobs on Hadoop or Amazon Web Services
Cdap
⭐
707
An open source framework for building data analytic applications.
Mincemeatpy
⭐
467
Lightweight MapReduce in python
Tdigest
⭐
332
t-Digest data structure in Python. Useful for percentiles and quantiles, including distributed environments like PySpark
Hadoopy
⭐
244
Python MapReduce library written in Cython. Visit us in #hadoopy on freenode. See the link below for documentation and tutorials.
Appengine Mapreduce
⭐
222
A library for running MapReduce jobs on App Engine
Juicy Bigdata
⭐
162
🎉🎉🐳 Datawhale Introduction to Big Data Processing tutorial | The opening course of the big data technology track 🎉🎉
Cc Mrjob
⭐
157
Demonstration of using Python to process the Common Crawl dataset with the mrjob framework
Pda_book
⭐
154
Code Examples Data Science using Python
Data Algorithms With Spark
⭐
146
O'Reilly Book: [Data Algorithms with Spark] by Mahmoud Parsian
Kaylee
⭐
123
MapReduce with ZeroMQ
Python Bigdata
⭐
104
Data science and Big Data with Python
Introtohadoopandmr__udacity_course
⭐
103
🐘 Source code for assignments of Udacity course "Introduction to Hadoop and MapReduce"
Dampr
⭐
103
Python Data Processing library
Spark With Python
⭐
98
Fundamentals of Spark with Python (using PySpark), code examples
Big Data Engineering Coursera Yandex
⭐
91
Big Data for Data Engineers Coursera Specialization from Yandex
Annotated Wikiextractor
⭐
88
Simple Wikipedia plain text extractor with article link annotations and Hadoop support.
Solutions Google Compute Engine Cluster For Hadoop
⭐
81
This sample app will get you up and running quickly with a Hadoop cluster on Google Compute Engine. For more information on running Hadoop on GCE, read the papers at https://cloud.google.com/resources/.
Flox
⭐
80
Fast & furious GroupBy operations for dask.array
Rail
⭐
70
Scalable RNA-seq analysis
Chess
⭐
60
A MapReduce job to explore blunders in chess games.
Pypar
⭐
58
Efficient and scalable parallelism using the message passing interface (MPI) to handle big data and highly computational problems.
Pybigdata
⭐
56
Using Python to work with the various components of the big data stack
Social Graph Analysis
⭐
56
Social Graph Analysis using Elastic MapReduce and PyPy
Snabler
⭐
54
Parallel Algorithms in Python for Hadoop/Mapreduce
Prosto
⭐
53
Prosto is a data processing toolkit radically changing how data is processed by heavily relying on functions and operations with functions - an alternative to map-reduce and join-groupby
Hadoop_vision
⭐
49
Example code for "Web-Scale Computer Vision using MapReduce for Multimedia Data Mining"
Ipdc
⭐
44
IPDC (InterPlanetary Distributed Computing) is a distributed computation service: a peer-to-peer hypermedia protocol to make computation faster, open, and more scalable.
Machine_learning_in_action_py3
⭐
44
An important book about machine learning algorithms that introduces how these algorithms and tools are applied and how to use them in a real environment. Where other books are long on machine learning theory, this book focuses more on how to implement machine learning algorithms in code.
Telemetry Server
⭐
41
Server for the Mozilla Telemetry project
Pig
⭐
40
Package for Apache Pig support in Sublime Text 2 and 3
Hanhan Spark Python
⭐
40
Uses Spark Core (Python), Spark SQL, Spark MLlib, and Spark Streaming
Mpms
⭐
39
Simple Python multiprocess-multithread task queue that can also do simple MapReduce. Written for personal use; do not use in production.
Hadoop_exporter
⭐
35
A Hadoop exporter for Prometheus that scrapes Hadoop metrics (including HDFS, YARN, MapReduce, HBase, etc.) from the JMX URLs of Hadoop components.
Pyspark Algorithms
⭐
33
PySpark Algorithms Book: https://www.amazon.com/dp/B07X4B2218/ref=sr_1_2
Emrio
⭐
30
Elastic MapReduce instance optimizer
Webarchive Indexing
⭐
30
Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system.
Compute Hadoop Java Python
⭐
28
This software demonstrates one way to create and manage a cluster of Hadoop nodes running on Google Compute Engine.
Scipy2013
⭐
27
SciPy 2013 Data Processing Tutorial
Mongo Hive
⭐
24
Load your MongoDB collection into Hive. Supports complex JSON structure.
Redisgears Py
⭐
23
RedisGears python client
Learn Hadoop And Spark
⭐
22
This repository focuses on gathering a curated list of resources to learn Hadoop for FREE.
Mrnmf
⭐
22
Nonnegative matrix factorizations in MapReduce
Dancedeets Monorepo
⭐
22
DanceDeets Codebase: The python server (with React.JS rendering), as well as the React Native mobile app (and their shared code)
Coursera Uw Machine Learning Clustering Retrieval
⭐
21
Mining Frequent Pattern From Search History
⭐
19
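Course project for "Big Data Mining Technology" at Fudan University: attempts to find frequent 2-itemsets of keywords with high support in the search records of the Sogou Labs user query log data (2008) using a Hadoop cluster, and implements the three MapReduce stages of the Parallel FP-Growth algorithm in Python.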
Pymapreduce
⭐
19
Simple MapReduce implementation in Python, for text file parallel processing
Ceteri Mapred
⭐
19
MapReduce examples
Spark And Mllib Projects
⭐
18
This repository contains Spark, MLlib, PySpark and Dataframes projects
Hadoop And Swift Integration
⭐
18
API to run Hadoop MapReduce programs over Swift
Mrtsqr
⭐
18
MapReduce Streaming TSQR Implementation
Hadoop Python Tutorial
⭐
18
Exercises and examples developed for the Hadoop with Python tutorial
Disco Slct
⭐
17
A mapreduce implementation of SLCT (http://ristov.users.sourceforge.net/slct/) using Disco.
Sf Python Meetup Sep 2013
⭐
17
Deliroll presentation given at the SF Python Meetup on September 11th 2013
Real_time_social_media_mining
⭐
16
DevOps pipeline for Real Time Social/Web Mining
Mrlin
⭐
16
mrlin is 'MapReduce processing of Linked Data' … because it's magic
Spark
⭐
16
Python 2.7 code and learning notes for Spark 2.1.1
Cs205_ga
⭐
16
How deep does Google Analytics go? Efficiently tackling Common Crawl using AWS & MapReduce
Hadoop Mapreduce
⭐
16
Collection of examples and notes on Hadoop and MapReduce
Mongo Bigquery
⭐
16
Load your MongoDB collection into Google BigQuery. Supports complex JSON structure.
Udacity_hadoop_intro
⭐
15
Notes and tasks code for Cloudera / Udacity hadoop course
Pyspark
⭐
15
spark (scala and python)
Bigquery Appengine Datastore Import Sample
⭐
15
Demonstrates how to extract and transform data from the Datastore into a format suitable for ingestion by Google BigQuery, via the App Engine MapReduce library.
Pairwise Mapreduce
⭐
15
Implementation of a pairwise document similarity algorithm using MapReduce.
Intro To Hadoop Mapreduce
⭐
14
Twittercommunitydetection
⭐
14
Community Detection for Twitter follower network of 40 million users using mapreduce
Hadoop Mapreduce Python Example
⭐
13
Map Reduce example for Hadoop in Python based on Udacity: Intro to Hadoop and MapReduce
Yarn Memory Calculator
⭐
12
Hadoop YARN & MapReduce Memory Calculator
Ysmart
⭐
12
Mirror of YSmart
Python Performance
⭐
12
Repository for the book Fast Python - published by Manning
Fdic Call Reports
⭐
11
Tools for analysis and review of FDIC call report data
Dijkstra Hadoop Spark
⭐
10
Dijkstra Algorithm - Python Hadoop Streaming and Pyspark
Emr Demo
⭐
10
Project files for the post: Running PySpark Applications on Amazon EMR: Methods for Interacting with PySpark on Amazon Elastic MapReduce.
Cloudstoragelineinputreader
⭐
10
An appengine-mapreduce line input reader for Google Cloud Storage
Prince
⭐
10
Extra-light API for using Hadoop with Python
Sentiment Analysis
⭐
10
Distributed sentiment analysis on GitHub commit comments
Pyrallel
⭐
10
Yet another easy-to-use python3 parallel library for humans.
Cc Mrjob
⭐
9
Demonstration of using Python to process the Common Crawl dataset with the mrjob framework
Mapreduce
⭐
9
MapReduce scripts written in Python for Hadoop Streaming
Tf Idf Implementation Using Map Reduce Hadoop Python
⭐
9
Appengine Mapreduce Utils
⭐
9
Extra utils for http://code.google.com/p/appengine-mapreduce/
Academicrecommendation_dlut
⭐
9
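For scholars at Dalian University of Technology: based on the university's institutional knowledge repository, recommends potential topic terms, optimizes research directions, and promotes work across faculties, disciplines, …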
Ascii_dna_translator
⭐
9
DNA as an information storage medium using either 4-base codon or binary representations of 256-ASCII
Hadoop And Mapreduce
⭐
8
Udacity-Intro to Hadoop and MapReduce-Part 1
Inforetrieval
⭐
8
Inverted indexer, web crawler, sort, search, and Porter stemmer written in Python for information retrieval.
Python2 Course
⭐
8
Python 2.7 tutorial
Hadoop Mapreduce Examples Python
⭐
8
All the Hadoop Mapreduce examples in python!
Common_crawl
⭐
8
Simple Python MapReduce jobs for processing the Common Crawl plus command-line utilities
Implementation Of Mapreduce Algorithms Using A Simple Python Mapreduce Framework
⭐
8
Implements common data processing tasks such as creation of an inverted index, performing a relational join, multiplying sparse matrices and dna-sequence trimming using a simple MapReduce model, on a single machine in python.
Shepherdpy
⭐
8
A companion library for mincemeatpy that will manage MapReduce clients.
Wiki_pagerank
⭐
7
Playing with the wikipedia link graph
Airflow Plugin
⭐
7
Plugin for Apache Airflow to execute serverless tasks using Lithops
Vector Space Model Of Information Retrieval
⭐
7
Search Engine in MapReduce framework
Phasis
⭐
7
Suite for phased clusters discovery, comparison, annotation and to identify miRNA triggers - uses MapReduce model [In review]
Rasppi Cluster
⭐
7
An efficient quick-start tool to build a Raspberry Pi (or Debian-based) cluster with popular ecosystems like Hadoop and Spark
Canopyclusteringpython
⭐
7
Canopy Clustering using MapReduce [Hadoop]
Map_reduce Ntua
⭐
6
Lab exercise of Advanced Topics in Database Systems course in NTUA regarding Map Reduce
1-100 of 204 search results