Awesome Open Source
Search results for python hadoop
295 search results found
Webarchive Indexing
⭐
30
Tools for bulk indexing of WARC/ARC files on Hadoop, EMR or local file system.
Gads7
⭐
28
GA Data Science NY Section 7
Eggo
⭐
28
Ready-to-go Parquet-formatted public 'omics datasets
Compute Hadoop Java Python
⭐
28
This software demonstrates one way to create and manage a cluster of Hadoop nodes running on Google Compute Engine.
Big Data Analytics
⭐
27
Big Data Analytics, published by Packt
Pydatapreprocessing
⭐
26
Source code download for the book 《Python数据预处理技术与实践》 (Python Data Preprocessing: Techniques and Practice)
Clusterdock
⭐
26
clusterdock is a framework for creating Docker-based container clusters
Pixelmor
⭐
25
Big Pixel Data for image/video processing.
Lagouspider
⭐
25
Crawler for Lagou job listings
Real_time_social_media_mining
⭐
24
DevOps pipeline for Real Time Social/Web Mining
Springboard Data Science Immersive
⭐
23
Training
⭐
23
Training in Linux, Python, automated operations, Docker, and big data technologies (first session)
Adherer
⭐
23
Computation of adherence to medications from Electronic Healthcare Data in R
Hadoop_jmx_exporter
⭐
23
HDFS & YARN jmx metrics prometheus exporter
Docker Hive On Tez
⭐
22
Docker image for Apache Hive running on Tez
Hpchadoop
⭐
22
Hadoop for Traditional HPC Users
Hdfscontents
⭐
22
An HDFS-backed ContentsManager implementation for IPython
Uba
⭐
22
UEBA Solution for Insider Security. This repo is archived. Thanks!
Learn Hadoop And Spark
⭐
22
This repository gathers a curated list of resources for learning Hadoop for free.
Hadoop Test Cluster
⭐
20
Dockerized setup for testing code on realistic hadoop clusters
Snakebite Py3
⭐
20
Pure python HDFS client: python3.x version
Tensoronspark
⭐
20
Running Tensorflow on Spark in the scalable, fast and compatible style
Collocations
⭐
20
bigram / trigram analysis of wikipedia; mainly mutual info
Mining Frequent Pattern From Search History
⭐
19
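Course project for "Big Data Mining Techniques" at Fudan University. It attempts to find frequent 2-itemsets of keywords with relatively high support in the Sogou Labs user query log data (2008) on a Hadoop cluster, and implements the three MapReduce stages of the Parallel FP-Growth algorithm in Python.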
Guineapig
⭐
19
Pure python PIG-like language
Layer Index
⭐
19
Index of layers for building charms
Yarnspawner
⭐
19
Spawn JupyterHub single user notebook servers in Hadoop/YARN containers.
Pythonsparkmlbookclub
⭐
19
Ceteri Mapred
⭐
19
MapReduce examples
Data Science Ebooks
⭐
19
Data Science E-books, Interview Resources and Cheat-sheets
Sandcrawler
⭐
19
Backend, IA-specific tools for crawling and processing the scholarly web. Content ends up in https://fatcat.wiki
Mrtsqr
⭐
18
MapReduce Streaming TSQR Implementation
Multidim
⭐
18
Visualising Multi Dimensional Data
Ansible Hdfs
⭐
18
An Ansible role for configuring HDFS
Django_hadoop
⭐
18
Hadoop integration for Django
Spark Notes
⭐
18
Notes taken while writing Spark or Scala code
Hadoop And Swift Integration
⭐
18
API to run Hadoop MapReduce programs over Swift
Hive_to_es
⭐
18
A small tool for syncing Hive data warehouse data to Elasticsearch
Wikipediaphilosophy
⭐
18
do all first links on wikipedia _really_ lead to philosophy?
Hadoop Python Tutorial
⭐
18
Exercises and examples developed for the Hadoop with Python tutorial
Hive Presto Docker
⭐
18
Hadoop, Hive and PrestoDB for deployment using Docker
Oci Cloudera
⭐
18
Terraform module to deploy Cloudera on Oracle Cloud Infrastructure (OCI)
Redstack
⭐
17
REDstack - Hadoop as a service on OpenStack
Dbimport
⭐
17
DBImport ingestion tool. Handle import, export and standard ETL flows in Hadoop/Hive
Flink Service Discovery
⭐
17
Discover Flink clusters on Hadoop YARN for Prometheus
Jumbo
⭐
17
🐘 A local Hadoop cluster bootstrapper using Vagrant, Ansible, and Ambari.
2012 Naward13
⭐
16
Kavetoolbox
⭐
16
Data analytics toolkit part of the KAVE, installable stand-alone
Cs205_ga
⭐
16
How deep does Google Analytics go? Efficiently tackling Common Crawl using AWS & MapReduce
Bigdata_learning
⭐
16
Code for learning big data components
Hadoop Mapreduce
⭐
16
Collection of examples and notes on Hadoop and MapReduce
Haatkit
⭐
16
Toolkit of simple scripts useful for managing Hadoop
Bidmach_spark
⭐
16
Code to allow running BIDMach on Spark including HDFS integration and lightweight sparse model updates (Kylix).
Mrlin
⭐
16
mrlin is 'MapReduce processing of Linked Data' … because it's magic
Udacity_hadoop_intro
⭐
15
Notes and task code for the Cloudera / Udacity Hadoop course
Gptools For Aws
⭐
15
GP Tools for Amazon Web Services Elastic Map Reduce (Hosted Hadoop Framework)
Hadoop_exporter
⭐
15
Exports hadoop metrics via HTTP for Prometheus consumption
Rastercube
⭐
15
rastercube is a python library for big data analysis of georeferenced time series data (e.g. MODIS NDVI)
Interview Notes
⭐
15
Summary notes on Python, big data, and MySQL
Kpi Stuff
⭐
15
Some of my laboratory work at KPI and related material.
Smoke
⭐
14
Run Spark jobs interactively from the web
Pyspark K8s Example
⭐
14
Edgar Oil Contracts
⭐
14
Mining the SEC's EDGAR system for oil contracts.
Athena
⭐
14
Interact with your Hadoop cluster from the convenience of your local command line.
Course 2016 2017 2st
⭐
14
Course summary for the second semester of the 2016-2017 academic year
Rasppi Cluster
⭐
14
An efficient quick-start tool for building a Raspberry Pi (or Debian-based) cluster with popular ecosystems such as Hadoop and Spark
Intro To Hadoop Mapreduce
⭐
14
Keyphrase
⭐
14
Key phrase extraction using Hadoop + Dumbo + NLTK
Imageserver
⭐
14
Distributed image server based on HDFS, HBase/Redis, nginx, etc.
Twittercommunitydetection
⭐
14
Community Detection for Twitter follower network of 40 million users using mapreduce
Hanythingondemand
⭐
13
hanythingondemand provides a set of scripts to easily set up an ad-hoc Hadoop cluster through PBS jobs
Cobra Policytool
⭐
13
Manage Apache Atlas and Ranger configuration for your Hadoop environment.
Dataanalysis_cases
⭐
13
"Data Analyst" project exercises and reference materials
Isilon_hadoop_tools
⭐
13
Tools for Using Hadoop with OneFS
Pocs
⭐
13
poc
Hiddenattributemodels
⭐
13
HAM
Datafabric_splunk
⭐
13
Hadoop Mapreduce Python Example
⭐
13
Map Reduce example for Hadoop in Python based on Udacity: Intro to Hadoop and MapReduce
Chelmbigstock
⭐
13
Study group for Hadoop, Python and analytics
Recsystem
⭐
12
A website and a recommender system
Sdc Api Tool
⭐
12
A set of utilities to help with management of Streamsets pipelines.
Docker Hadoop Spark
⭐
12
Dockerised Spark running on YARN
Nyc_taxi_pipeline
⭐
12
Design/Implement stream/batch architecture on NYC taxi data | #DE
Yarn Memory Calculator
⭐
12
Hadoop YARN & MapReduce Memory Calculator
Libsvm Hadoop
⭐
12
Ysmart
⭐
12
Mirror of YSmart
Bento Cluster
⭐
11
A zero-configuration Hadoop and HBase micro-cluster included in the Kiji BentoBox distribution
Docker Registry Driver Hdfs
⭐
11
HDFS driver for the docker-registry
Luigi Demo
⭐
11
A vagrant demo for luigi.
Saga Hadoop
⭐
11
Tool for spawning Hadoop Cluster on HPC infrastructures
Awsscripts
⭐
11
Scripts for making Hadoop deployments in AWS easy
Py Hadoop Tutorial
⭐
11
Source Material for using Python and Hadoop together
Cca175 Exam Preparation
⭐
11
Cloudera CCA175 Spark and Hadoop Developer exam preparation
Ga Dat 08
⭐
11
GA Data Science
Census
⭐
10
Python package for U.S. Census and American Community Survey
Ls Thrift Py Hadoop
⭐
10
Prince
⭐
10
Extra-light API for using Hadoop with Python
Coursera Hadoop Platform And Application Framework
⭐
10
Assignments for UC San Diego's Hadoop Platform and Application Framework class on Coursera
Ranger_modules
⭐
10
A set of modules aimed to manipulate policies on Apache Ranger.
Anaconda
⭐
10
python gift package
101-200 of 295 search results