Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Prefect | 12,915 | 1 | 138 | 6 hours ago | 225 | August 01, 2023 | 565 | apache-2.0 | Python | |
Prefect is a workflow orchestration tool empowering developers to build, observe, and react to data pipelines | ||||||||||
Tpot | 9,213 | 40 | 20 | 23 days ago | 61 | January 06, 2021 | 281 | lgpl-3.0 | Python | |
A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming. | ||||||||||
Great_expectations | 8,855 | 35 | 6 hours ago | 236 | August 04, 2023 | 143 | apache-2.0 | Python | ||
Always know what to expect from your data. | ||||||||||
Dagster | 8,548 | 41 | 6 hours ago | 105 | September 30, 2022 | 2,024 | apache-2.0 | Python | ||
An orchestration platform for the development, production, and observation of data assets. | ||||||||||
Pachyderm | 5,979 | 1 | 13 hours ago | 504 | August 04, 2023 | 882 | apache-2.0 | Go | ||
Data-Centric Pipelines and Data Versioning | ||||||||||
Mage Ai | 5,572 | 6 hours ago | 278 | August 08, 2023 | 140 | apache-2.0 | Python | |||
🧙 The modern replacement for Airflow. Build, run, and manage data pipelines for integrating and transforming data. | ||||||||||
Orchest | 3,876 | 4 months ago | 19 | December 13, 2022 | 125 | apache-2.0 | TypeScript | |||
Build data pipelines, the easy way 🛠️ | ||||||||||
Datascienceresources | 3,826 | a month ago | 20 | |||||||
Open Source Data Science Resources. | ||||||||||
Polyaxon | 3,387 | 4 | 12 | a day ago | 377 | August 14, 2023 | 122 | apache-2.0 | ||
MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle | ||||||||||
Pipelines | 3,293 | 2 | 71 | 21 hours ago | 125 | July 28, 2023 | 1,043 | apache-2.0 | Python | |
Machine Learning Pipelines for Kubeflow |
Prefect is an orchestrator for data-intensive workflows. It's the simplest way to transform any Python function into a unit of work that can be observed and orchestrated. With Prefect, you can build resilient, dynamic workflows that react to the world around them and recover from unexpected changes. With just a few decorators, Prefect supercharges your code with features like automatic retries, distributed execution, scheduling, caching, and much more. Every activity is tracked and can be monitored with the Prefect server or Prefect Cloud dashboard.
```python
from prefect import flow, task
from typing import List
import httpx


@task(retries=3)
def get_stars(repo: str):
    # Look up the repository on the GitHub API and report its star count.
    url = f"https://api.github.com/repos/{repo}"
    count = httpx.get(url).json()["stargazers_count"]
    print(f"{repo} has {count} stars!")


@flow(name="GitHub Stars")
def github_stars(repos: List[str]):
    # Each call to the task is tracked as its own unit of work.
    for repo in repos:
        get_stars(repo)


# run the flow!
github_stars(["PrefectHQ/Prefect"])
```
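The `retries=3` argument in the task above asks for a failed task to be run again. The sketch below shows only the general idea using the standard library. It is not Prefect's implementation. The names `with_retries`, `delay_seconds`, and `flaky_fetch` are hypothetical, and it assumes that `retries` counts re-runs after the first attempt.

```python
import functools
import time


def with_retries(retries: int = 3, delay_seconds: float = 0.0):
    """Hypothetical helper: re-run a function up to `retries` extra times on failure."""

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            attempts = retries + 1  # one initial try plus `retries` re-runs
            for attempt in range(1, attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == attempts:
                        raise  # out of retries: surface the last error
                    time.sleep(delay_seconds)

        return wrapper

    return decorator


calls = {"count": 0}


@with_retries(retries=3)
def flaky_fetch() -> str:
    """Fails twice, then succeeds, standing in for a flaky network call."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = flaky_fetch()
print(result, calls["count"])
```

An orchestrator adds more on top of this loop, such as recording each attempt so it can be inspected later, which is why the decorator form is convenient: the function body stays unchanged.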
After running some flows, fire up the Prefect UI to see what happened:
```shell
prefect server start
```
From here, you can continue to use Prefect interactively or deploy your flows to remote environments, running on a scheduled or event-driven basis.
Prefect requires Python 3.8 or later. To install Prefect, run the following command in a shell or terminal session:
```shell
pip install prefect
```
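If you are unsure whether your interpreter meets the stated Python 3.8 minimum, a quick check along these lines may help before installing. This is a small sketch using only the standard library; `MIN_VERSION` and `meets_minimum` are hypothetical names introduced for illustration.

```python
import sys

MIN_VERSION = (3, 8)  # minimum Python version stated in this README


def meets_minimum(version_info=sys.version_info, minimum=MIN_VERSION) -> bool:
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


current = ".".join(str(part) for part in sys.version_info[:3])
if meets_minimum():
    print(f"Python {current} meets the 3.8 minimum.")
else:
    print(f"Python {current} is older than 3.8; upgrade before installing.")
```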
Start by exploring the core concepts of Prefect workflows, then follow one of our friendly tutorials to learn by example.
Prefect is made possible by the fastest-growing community of thousands of friendly data engineers. Join us in building a new kind of workflow system. The Prefect Slack community is a fantastic place to learn more about Prefect, ask questions, or get help with workflow design. The Prefect Discourse is a community-driven knowledge base where you can find answers to your Prefect-related questions. All community forums, including code contributions, issue discussions, and Slack messages, are subject to our Code of Conduct.
See our documentation on contributing to Prefect.
Thanks for being part of the mission to build a new kind of workflow system and, of course, happy engineering!