Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Pyspark Example Project | 1,034 | a year ago | 11 | Python | ||||||
Example project implementing best practices for PySpark ETL jobs and applications. | ||||||||||
Butterfree | 269 | 1 | 4 months ago | 35 | November 14, 2023 | 6 | apache-2.0 | Python | ||
A tool for building feature stores. | ||||||||||
Apachespark | 59 | a year ago | Python | |||||||
This repository teaches Databricks concepts through examples. It covers the important topics a data engineer needs in real-world work, using PySpark and Spark SQL for development. The course ends with a few case studies. | ||||||||||
Datapipelines Essentials Python | 45 | 10 months ago | 1 | apache-2.0 | Python | |||||
Simplified ETL process in Hadoop using Apache Spark. Includes a complete ETL pipeline for a data lake, with SparkSession extensions, DataFrame validation, Column extensions, SQL functions, and DataFrame transformations. | ||||||||||
Basin | 29 | a year ago | 42 | other | TypeScript | |||||
Basin is a visual programming editor for building Spark and PySpark pipelines. Easily build, debug, and deploy complex ETL pipelines from your browser. | ||||||||||
Python_mozetl | 26 | 2 months ago | 23 | mit | Python | |||||
ETL jobs for Firefox Telemetry | ||||||||||
Spark Movies Etl | 21 | 7 months ago | 2 | Python | ||||||
Spark data pipeline that ingests and transforms movie ratings data. | ||||||||||
Sparklanes | 16 | 1 | 4 years ago | 5 | January 31, 2019 | 2 | mit | Python | ||
A lightweight data processing framework for Apache Spark | ||||||||||
Lineage | 14 | 2 | 2 years ago | 11 | January 26, 2022 | apache-2.0 | TypeScript | |||
Generate beautiful documentation for your data pipelines in markdown format | ||||||||||
Birgitta | 12 | a year ago | 34 | September 10, 2020 | 20 | mit | Python | |||
Birgitta is a Python ETL test and schema framework, providing automated tests for pyspark notebooks/recipes. |