Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Data Science Ipython Notebooks | 25,668 | | | | 7 months ago | | | 34 | other | Python |
Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines. | ||||||||||
Mrjob | 2,584 | 112 | 2 | 2 years ago | 62 | December 15, 2021 | 211 | other | Python | |
Run MapReduce jobs on Hadoop or Amazon Web Services | ||||||||||
Devops Bash Tools | 2,224 | | | | 3 months ago | | | 5 | mit | Shell |
1000+ DevOps Bash Scripts - AWS, GCP, Kubernetes, Docker, CI/CD, APIs, SQL, PostgreSQL, MySQL, Hive, Impala, Kafka, Hadoop, Jenkins, GitHub, GitLab, BitBucket, Azure DevOps, TeamCity, Spotify, MP3, LDAP, Code/Build Linting, pkg mgmt for Linux, Mac, Python, Perl, Ruby, NodeJS, Golang, Advanced dotfiles: .bashrc, .vimrc, .gitconfig, .screenrc, tmux.. | ||||||||||
Nagios Plugins | 1,119 | | | | 2 months ago | | | 71 | other | Python |
450+ AWS, Hadoop, Cloud, Kafka, Docker, Elasticsearch, RabbitMQ, Redis, HBase, Solr, Cassandra, ZooKeeper, HDFS, Yarn, Hive, Presto, Drill, Impala, Consul, Spark, Jenkins, Travis CI, Git, MySQL, Linux, DNS, Whois, SSL Certs, Yum Security Updates, Kubernetes, Cloudera etc... | ||||||||||
Devops Python Tools | 709 | | | | 4 months ago | | | 37 | mit | Python |
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Functions, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc. | ||||||||||
Flintrock | 627 | 4 | 5 months ago | 14 | November 27, 2023 | 36 | apache-2.0 | Python | ||
A command-line tool for launching Apache Spark clusters. | ||||||||||
Aws Glue Libs | 568 | | | | 9 months ago | | | 96 | other | Python |
AWS Glue Libraries are additions and enhancements to Spark for ETL operations. | ||||||||||
Data Engineering Interview Questions | 554 | | | | 7 months ago | | | | | |
More than 2,000 data engineering interview questions. | ||||||||||
Spark Redshift | 514 | 4 | 1 | 4 years ago | 10 | November 01, 2016 | 134 | apache-2.0 | Scala | |
Redshift data source for Apache Spark | ||||||||||
Cloudbreak | 348 | | | | 3 months ago | | | 41 | apache-2.0 | Java |
CDP Public Cloud is an integrated analytics and data management platform deployed on cloud services. It offers broad data analytics and artificial intelligence functionality along with secure user access and data governance features. |