Awesome Open Source
Search results for python hdfs
132 search results found
Cat
⭐
18,237
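CAT is a foundational component for server-side projects, providing clients in multiple languages including Java, C/C++, Node.js, Python, and Go. It is already in use across Meituan-Dianping's infrastructure middleware frameworks (MVC framework, RPC framework, database framework, cache framework, etc.,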
Tensorflowonspark
⭐
3,851
TensorFlowOnSpark brings TensorFlow programs to Apache Spark clusters.
Ibis
⭐
3,404
The flexibility of Python with the scale and performance of modern SQL.
Smart_open
⭐
3,065
Utils for streaming large files (S3, HDFS, gzip, bz2...)
Hopsworks
⭐
1,041
Hopsworks - Data-Intensive AI platform with a Feature Store
Snakebite
⭐
854
A pure python HDFS client
Devops Python Tools
⭐
709
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Functions, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
Daily Deeplearning
⭐
532
🔥 Machine learning / deep learning / Python / algorithm interview / NLP tutorial / Coding Interviews (剑指offer)
Minos
⭐
508
Minos is beyond a hadoop deployment system.
Deeplog
⭐
320
Pytorch Implementation of DeepLog.
Packetpig
⭐
309
Packetpig - Open Source Big Data Security Analytics
Tensorspark
⭐
302
TensorFlow on Spark
Hdfs
⭐
257
API and command line interface for HDFS
Pysparkling
⭐
253
A pure Python implementation of Apache Spark's RDD and DStream interfaces.
Omniduct
⭐
247
A toolkit providing a uniform interface for connecting to and extracting data from a wide variety of (potentially remote) data stores (including HDFS, Hive, Presto, MySQL, etc).
Hadoopy
⭐
244
Python MapReduce library written in Cython. Visit us in #hadoopy on freenode. See the link below for documentation and tutorials.
Wradlib
⭐
238
weather radar data processing - python package
Hadoop Attack Library
⭐
200
A collection of pentest tools and resources targeting Hadoop environments
Tiledb Py
⭐
167
Python interface to the TileDB storage engine
Juicy Bigdata
⭐
162
🎉🎉🐳 Datawhale Introduction to Big Data Processing tutorial | The opening course in the big data technology track 🎉🎉
Ipython Spark Docker
⭐
151
Portainer
⭐
130
Apache Mesos framework for building Docker images on a cluster of machines
Plsc
⭐
129
Paddle Large Scale Classification Tools, supports ArcFace, CosFace, PartialFC, Data Parallel + Model Parallel. Model includes ResNet, ViT, Swin, DeiT, CaiT, FaceViT, MoCo, MAE, ConvMAE, CAE.
Skein
⭐
126
A tool and library for easily deploying applications on Apache YARN
Megfile
⭐
99
Megvii FILE Library - Working with Files in Python same as the standard library
Spark With Python
⭐
98
Fundamentals of Spark with Python (using PySpark), code examples
Big Data Engineering Coursera Yandex
⭐
91
Big Data for Data Engineers Coursera Specialization from Yandex
Pyhdfs
⭐
88
Python HDFS client
Python Hdfs
⭐
67
HDFS client for Python
Mylearningnotes
⭐
58
Because it's never too late to start taking notes and make them public...
Pfio
⭐
51
IO library to access various filesystems with unified API
Cluster Pack
⭐
44
A library on top of either pex or conda-pack to make your Python code easily available on a cluster
Pydfs
⭐
44
Tiny distributed file system like HDFS (and of course GFS)
Spark Cluster Deployment
⭐
43
Automates Spark standalone cluster tasks with Puppet and Fabric.
Datashark
⭐
41
dataShark is a Security & Network Event Analytics Framework built on Apache Spark
Hadoop Multi Server Ansible
⭐
38
Multi-server deployment of Hadoop using Ansible
Blueking Dbm
⭐
37
DBM, database management
Hadoop_exporter
⭐
35
A Hadoop exporter for Prometheus that scrapes Hadoop metrics (including HDFS, YARN, MAPREDUCE, HBASE, etc.) from the JMX URLs of Hadoop components.
Opendataplatform
⭐
34
An open source, enterprise-scale, vendor-neutral data platform accelerating solution delivery.
Jydoop
⭐
32
Efficient Hadoop Map-Reduce in Python
Compute Hadoop Java Python
⭐
28
This software demonstrates one way to create and manage a cluster of Hadoop nodes running on Google Compute Engine.
Glm Parser
⭐
26
Tree-adjoining grammar based statistical dependency parser using a general linear model (glm).
Fablinker
⭐
25
A tool for operating multiple servers interactively. An interactive, automated multi-server operations tool that is simple and easy to use.
Real_time_social_media_mining
⭐
24
DevOps pipeline for Real Time Social/Web Mining
Mongo Hive
⭐
24
Load your MongoDB collection into Hive. Supports complex JSON structure.
Psd2svg
⭐
24
PSD to SVG converter.
Hadoop_jmx_exporter
⭐
23
HDFS & YARN jmx metrics prometheus exporter
Hdfscontents
⭐
22
An HDFS-backed ContentsManager implementation for IPython
Mrnmf
⭐
22
Nonnegative matrix factorizations in MapReduce
Whakapai
⭐
22
Various Python Data Science Projects available in PyPi
Spark Yarn Rest Api
⭐
20
Demonstrates how to submit a job to Spark on HDP directly via YARN's REST API from any workstation
Feature_engineering
⭐
20
(Under Development) Extract features from text and links. Useful for machine learning algorithms.
Tensoronspark
⭐
20
Running Tensorflow on Spark in the scalable, fast and compatible style
Snakebite Py3
⭐
20
Pure python HDFS client: python3.x version
Mining Frequent Pattern From Search History
⭐
19
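Course project for "Big Data Mining Techniques" at Fudan University: attempts to find frequent 2-itemsets of keywords with high support in search records from the Sogou Labs user query log data (2008) using a Hadoop cluster, and implements the three MapReduce stages of the Parallel FP-Growth algorithm in Python.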
Django_hadoop
⭐
18
Hadoop integration for Django
Hive Presto Docker
⭐
18
Hadoop, Hive and PrestoDB for deployment using Docker
Hive_to_es
⭐
18
A small tool for syncing Hive data warehouse data to Elasticsearch
Spark Notes
⭐
18
Notes taken while writing Spark or Scala
Ansible Hdfs
⭐
18
An Ansible role for configuring HDFS
Kiji Express
⭐
17
Spark Cnn
⭐
16
CS848 Final Project (using spark to speed up CNN)
Ambari Drill Service
⭐
16
Ambari service for Apache Drill
Hive_compared_bq
⭐
16
hive_compared_bq compares/validates 2 (SQL like) tables, and graphically shows the rows/columns that are different.
Bidmach_spark
⭐
16
Code to allow running BIDMach on Spark including HDFS integration and lightweight sparse model updates (Kylix).
Udacity_hadoop_intro
⭐
15
Notes and tasks code for Cloudera / Udacity hadoop course
Bigkube
⭐
14
Minikube for big data with Scala and Spark
Imageserver
⭐
14
Distributed image server based on HDFS, HBASE/Redis, nginx, etc.
Bigdata_docker
⭐
13
Big Data Docker Data Science Spark Spark3 Hadoop HDFS Scala Python Artificial Intelligence Machine Learning Jupyter Lab Notebook
Hdfs Geohex
⭐
13
(Web)Mapping Elephants with Sparks
Robot
⭐
13
Robot is a framework based on Flink below v1.5, serving the 'oshit team', developed in Python 3.6.
Isilon_hadoop_tools
⭐
13
Tools for Using Hadoop with OneFS
Pystream
⭐
13
Stream backups directly to/from S3/HDFS without wasting disk space during the process
Aurora Redis
⭐
12
Oozie Pyspark Workflow
⭐
12
Example of an Oozie workflow with a PySpark action using Python eggs
Cmsspark
⭐
12
General purpose framework to run CMS experiment workflows on HDFS/Spark platform
Ysmart
⭐
12
Mirror of YSmart
Docker Registry Driver Hdfs
⭐
11
HDFS driver for the docker-registry
Hive_merge
⭐
11
Merge Small files for Hive Table on HDFS
Fm
⭐
11
using FM latent vectors as embedding features
Bento Cluster
⭐
11
A zero-configuration Hadoop and HBase micro-cluster included in the Kiji BentoBox distribution
Cca175 Exam Preparation
⭐
11
Cloudera CCA175 Spark and Hadoop Developer exam preparation
Ambarielasticsearch
⭐
10
ElasticSearch Custom Service For Installation using Ambari
Bigdata20180301
⭐
10
Introduction to Big Data: course materials
Ranger_modules
⭐
10
A set of modules aimed to manipulate policies on Apache Ranger.
Thrive
⭐
10
Thrive is an ETL framework that runs single-row transformations on HDFS data and makes the data available in relational databases (Hive and Vertica).
Memsql Loader
⭐
9
Deprecated - Check out MemSQL Pipelines instead!
Django Hdfs
⭐
9
HDFS interface utilities for Django, including file storage.
Filemerge
⭐
9
Filemerge is a utility for merging a large number of small HDFS files into a smaller number of large files. Filemerge is intended for use by Hadoop operations engineers and map-reduce application developers.
Pydistcp
⭐
9
A python Web HDFS based tool for inter/intra-cluster data copying.
Hadoop Mapreduce Examples Python
⭐
8
All the Hadoop Mapreduce examples in python!
Netlytics
⭐
8
NetLytics is a Hadoop-powered framework for performing advanced analytics on various kinds of network logs
Spark Ec2 Setup
⭐
8
Hadoop Spark Vagrant Ansible
⭐
8
Geotrellis Ec2 Cluster
⭐
8
Scripts to deploy a GeoTrellis Spark cluster on EC2
Pysqoop
⭐
8
A Python package that lets you import data from an RDBMS into HDFS using Sqoop
Hadoop And Mapreduce
⭐
8
Udacity-Intro to Hadoop and MapReduce-Part 1
Theft Market
⭐
7
Infrastructure for analyzing historical real estate data
Nyc 311 Data Analytics
⭐
7
Hive, Python, Tableau and more..
Hdfs Test
⭐
7
1-100 of 132 search results