Data Pipeline Alternatives

Name: GoogleCloudPlatform/Data-Pipeline
Brand: GoogleCloudPlatform/Data-Pipeline
SKU: project/GoogleCloudPlatform/Data-Pipeline
Rating: 4.45 (79 reviews)

Data pipeline is a tool to run Data loading pipelines. It is an open sourced app engine app that users can extend to suit their own needs. Out of the box it will load files from a source, transform them and then output them (output might be writing to a file or loading them into a data analysis tool). It is designed to be modular and support various sources, transformation technologies and output types. The transformations can be chained together to form complex pipelines.

Categories > Data Processing > Google

Suggest Alternative

Stars

Alternatives

License

apache-2.0

Open Issues

Most Recent Commit

over 12 years ago

Programming Language

Python

Dependent Repos

Dependent Packages

Total Releases

Categories

Programming Languages > Python

Companies > Google

Cloud Computing > Cloud Computing

Data Processing > Pipeline

Data Processing > Hadoop

Data Processing > Bigquery

Repo

Alternatives To GoogleCloudPlatform/Data-Pipeline

Project Name	Stars	Repos Using This	Packages Using This	Most Recent Commit	Total Releases	Latest Release	Open Issues	License	Language
GoogleCloudPlatform/professional-services	2,635	0	0	over 2 years ago	0		41	apache-2.0	Python
Common solutions and tools developed by Google Cloud's Professional Services team. This repository and its contents are not an officially supported Google product.
GoogleCloudPlatform/DataflowJavaSDK	853	249	14	over 5 years ago	38	June 26, 2018	54
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
ahangchen/GoogleML	203	0	0	about 5 years ago	0		11		Python
Google机器学习教程笔记（基础版）
GoogleCloudPlatform/DataflowPythonSDK	157	0	0	about 9 years ago	0		20
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
GoogleCloudPlatform/DataflowSDK-examples	148	0	0	almost 8 years ago	0		5
Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines. This repository hosts a few example pipelines to get you started with Dataflow.
wolterlw/hand_tracking	113	0	0	over 5 years ago	0		5	apache-2.0	Python
Minimal Python interface for Google's Mediapipe HandTracking pipeline
GoogleCloudPlatform/kubernetes-bigquery-python	106	0	0	over 5 years ago	0		5	apache-2.0	Python
Example Kubernetes app that shows how to build a 'pipeline' to stream data into BigQuery. Uses Redis or Google Cloud PubSub
GoogleCloudPlatform/Data-Pipeline	79	0	0	over 12 years ago	0		2	apache-2.0	Python
Data pipeline is a tool to run Data loading pipelines. It is an open sourced app engine app that users can extend to suit their own needs. Out of the box it will load files from a source, transform them and then output them (output might be writing to a file or loading them into a data analysis tool). It is designed to be modular and support various sources, transformation technologies and output types. The transformations can be chained together to form complex pipelines.
bomboradata/pubsub-to-bigquery	64	0	0	about 8 years ago	0		0	apache-2.0	Java
A highly configurable Google Cloud Dataflow pipeline that writes data into Google Big Query table from Pub/Sub
GoogleCloudPlatform/continuous-deployment-bitbucket	60	0	0	about 6 years ago	0		1	apache-2.0	Python

Alternatives To GoogleCloudPlatform/Data-Pipeline

Select To Compare

GoogleCloudPlatform/professional-services ⭐ 2,635

Common solutions and tools developed by Google Cloud's Professional Services team. This repository and its contents are not an officially supported Google product.

dependent packages 0 total releases 0 most recent commit over 2 years ago

GoogleCloudPlatform/DataflowJavaSDK ⭐ 853

Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.

dependent packages 14 total releases 38 most recent commit over 5 years ago

ahangchen/GoogleML ⭐ 203

Google机器学习教程笔记（基础版）

dependent packages 0 total releases 0 most recent commit about 5 years ago

GoogleCloudPlatform/DataflowPythonSDK ⭐ 157

Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.

dependent packages 0 total releases 0 most recent commit about 9 years ago

GoogleCloudPlatform/DataflowSDK-examples ⭐ 148

Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines. This repository hosts a few example pipelines to get you started with Dataflow.

dependent packages 0 total releases 0 most recent commit almost 8 years ago

wolterlw/hand_tracking ⭐ 113

Minimal Python interface for Google's Mediapipe HandTracking pipeline

dependent packages 0 total releases 0 most recent commit over 5 years ago

GoogleCloudPlatform/kubernetes-bigquery-python ⭐ 106

Example Kubernetes app that shows how to build a 'pipeline' to stream data into BigQuery. Uses Redis or Google Cloud PubSub

dependent packages 0 total releases 0 most recent commit over 5 years ago

GoogleCloudPlatform/Data-Pipeline ⭐ 79

dependent packages 0 total releases 0 most recent commit over 12 years ago

bomboradata/pubsub-to-bigquery ⭐ 64

A highly configurable Google Cloud Dataflow pipeline that writes data into Google Big Query table from Pub/Sub

dependent packages 0 total releases 0 most recent commit about 8 years ago

GoogleCloudPlatform/continuous-deployment-bitbucket ⭐ 60

dependent packages 0 total releases 0 most recent commit about 6 years ago

Suggest An Alternative To Data-Pipeline

Alternative Project Comparisons

GoogleCloudPlatform/Data-Pipeline vs Professional Services

GoogleCloudPlatform/Data-Pipeline vs Dataflowjavasdk

GoogleCloudPlatform/Data-Pipeline vs Googleml

GoogleCloudPlatform/Data-Pipeline vs Dataflowpythonsdk

GoogleCloudPlatform/Data-Pipeline vs Dataflowsdk Examples

GoogleCloudPlatform/Data-Pipeline vs Hand_tracking

GoogleCloudPlatform/Data-Pipeline vs Kubernetes Bigquery Python

GoogleCloudPlatform/Data-Pipeline vs Data Pipeline

GoogleCloudPlatform/Data-Pipeline vs Pubsub To Bigquery

GoogleCloudPlatform/Data-Pipeline vs Continuous Deployment Bitbucket

Popular Pipeline Projects

apache/airflow⭐ 33,219

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

nushell/nushell⭐ 28,304

A new type of shell

vectordotdev/vector⭐ 21,215

A high-performance observability data pipeline.

jina-ai/jina⭐ 19,573

☁️ Build multimodal AI applications with cloud-native stack

spotify/luigi⭐ 17,046

Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.

Popular Google Projects

google/material-design-icons⭐ 49,227

Material Design icons by Google

google/guava⭐ 48,993

Google core libraries for Java

rclone/rclone⭐ 42,258

"rsync for cloud storage" - Google Drive, S3, Dropbox, Backblaze B2, One Drive, Swift, Hubic, Wasabi, Google Cloud Storage, Yandex Files

Asabeneh/30-Days-Of-JavaScript⭐ 39,974

30 days of JavaScript programming challenge is a step-by-step guide to learn JavaScript programming language in 30 days. This challenge may take more than 100 days, please just follow your own pace. These videos may help too: https://www.youtube.com/channel/UC7PNRuno1rzYPb1xLa4yktw

danny-avila/LibreChat⭐ 36,936

Enhanced ChatGPT Clone: Features Agents, MCP, DeepSeek, Anthropic, AWS, OpenAI, Responses API, Azure, Groq, o1, GPT-5, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message search, Code Interpreter, langchain, DALL-E-3, OpenAPI Actions, Functions, Secure Multi-User Auth, Presets, open-source for self-hosting. Active.

Popular Data Processing Categories

Jupyter Notebook

Dataset

Sql

Validation

Pipeline

Translation

Data Science

Classification

Transaction

Scraper