Kestra

Kestra is an infinitely scalable orchestration and scheduling platform, creating, running, scheduling, and monitoring millions of complex pipelines.

Kestra workflow orchestrator

Event-driven declarative orchestrator to simplify data operations

Live Demo

Try Kestra using our live demo.

What is Kestra

Kestra is an open-source, event-driven orchestrator that simplifies data operations and improves collaboration between engineers and business users. By bringing Infrastructure as Code best practices to data pipelines, Kestra allows you to build reliable workflows and manage them with confidence.

Thanks to the declarative YAML interface for defining orchestration logic, everyone who benefits from analytics can participate in the data pipeline creation process. The UI automatically adjusts the YAML definition any time you make changes to a workflow from the UI or via an API call. Therefore, the orchestration logic is defined declaratively in code, even if some workflow components are modified in other ways.

Key concepts

  1. Flow is the main component in Kestra. It's a container for your tasks and orchestration logic.
  2. Namespace is used to provide logical isolation, e.g., to separate development and production environments. Namespaces are like folders on your file system — they organize flows into logical categories and can be nested to provide a hierarchical structure.
  3. Tasks are atomic actions in a flow. By default, all tasks in the list are executed sequentially. Additional options let you, among other things, run tasks in parallel or allow specific tasks to fail.
  4. Triggers define when a flow should run. In Kestra, flows are triggered based on events. Examples of such events include:
    • a regular time-based schedule
    • an API call (webhook trigger)
    • ad-hoc execution from the UI
    • a flow trigger - flows can be triggered from other flows using a flow trigger or a subflow, enabling highly modular workflows.
    • custom events, including a new file arrival (file detection event), a new message in a message bus, query completion, and more.
  5. Inputs allow you to pass runtime-specific variables to a flow. They are strongly typed, and allow additional validation rules.
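
The concepts above can be sketched in a single flow definition. This is a minimal illustration rather than a reference: the flow id, the `dev` namespace, the input named `user`, and the cron expression are made up for the example, and the input syntax (`name`, `type`, `defaults`) is an assumption that may differ between Kestra versions, so check the documentation for your release.

```yaml
# Hypothetical flow: one typed input, one task, one time-based trigger.
id: greet
namespace: dev                # namespace used as logical isolation (assumed name)
inputs:
  - name: user                # assumption: some releases use `id` instead of `name`
    type: STRING              # inputs are strongly typed
    defaults: world           # assumed default, used when the trigger passes no value
tasks:
  - id: say-hello
    type: io.kestra.core.tasks.log.Log
    message: "Hello {{ inputs.user }}!"
triggers:
  - id: every-morning
    type: io.kestra.core.models.triggers.types.Schedule
    cron: "0 8 * * *"         # run once a day at 08:00
```

The same flow can still be started ad hoc from the UI, in which case the input value can be supplied at execution time.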

Extensible platform via plugins

Most tasks in Kestra are available as plugins, but many types of tasks are available in the core library, including script tasks supporting various programming languages (e.g., Python, Node, Bash) and the ability to orchestrate your business logic packaged into Docker container images.

To create your own plugins, check the plugin developer guide.

Rich orchestration capabilities

Kestra provides a variety of tasks to handle both simple and complex business logic, including:

  • retries
  • timeout
  • error handling
  • conditional branching
  • dynamic tasks
  • sequential and parallel tasks
  • skipping tasks or triggers when needed by setting the flag disabled to true.
  • configuring dependencies between tasks, flows and triggers
  • advanced scheduling and trigger conditions
  • backfills
  • documenting your flows, tasks and triggers by adding a markdown description to any component
  • adding labels to add additional metadata to your flows such as the flow owner or team:
id: hello
namespace: prod
description: Hi from `Kestra` and a **markdown** description.
labels:
  owner: john-doe
  team: data-engineering
tasks:
  - id: hello
    type: io.kestra.core.tasks.log.Log
    message: Hello world!
    description: a *very* important task
    disabled: false
    timeout: PT10M # type: Duration (ISO 8601, same format as the retry interval)
    retry:
      type: constant # type: string
      interval: PT15M # type: Duration
      maxDuration: PT1H # type: Duration
      maxAttempt: 5 # type: int
      warningOnRetry: true # type: boolean, default is false
  - id: parallel
    type: io.kestra.core.tasks.flows.Parallel
    concurrent: 3
    tasks:
      - id: task1
        type: io.kestra.core.tasks.scripts.Bash
        commands:
          - 'echo "running {{task.id}}"'
          - 'sleep 10'
      - id: task2
        type: io.kestra.core.tasks.scripts.Bash
        commands:
          - 'echo "running {{task.id}}"'
          - 'sleep 10'
      - id: task3
        type: io.kestra.core.tasks.scripts.Bash
        commands:
          - 'echo "running {{task.id}}"'
          - 'sleep 10'
triggers:
  - id: schedule
    type: io.kestra.core.models.triggers.types.Schedule
    cron: "*/15 * * * *"
    backfill:
      start: 2023-06-25T14:00:00Z
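
The example above does not show conditional branching or error handling. The following is a hedged sketch of how they might look, assuming the core `io.kestra.core.tasks.flows.Switch` task (with `value`, `cases`, and `defaults`) and a flow-level `errors` block behave as in the Kestra documentation. All ids, the `dev` namespace, and the `env` input are hypothetical.

```yaml
# Hypothetical flow: branch on an input value and log a message if anything fails.
id: branch-and-recover
namespace: dev
inputs:
  - name: env                 # assumption: some releases use `id` instead of `name`
    type: STRING
tasks:
  - id: route
    type: io.kestra.core.tasks.flows.Switch
    value: "{{ inputs.env }}" # the rendered value selects one of the cases below
    cases:
      prod:
        - id: prod-path
          type: io.kestra.core.tasks.log.Log
          message: Running the production branch
      staging:
        - id: staging-path
          type: io.kestra.core.tasks.log.Log
          message: Running the staging branch
    defaults:                 # assumed fallback when no case matches
      - id: fallback
        type: io.kestra.core.tasks.log.Log
        message: Unknown environment, nothing to do
errors:                       # assumed to run only when a task in the flow fails
  - id: on-failure
    type: io.kestra.core.tasks.log.Log
    message: The flow failed, check the execution logs
```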

Built-in code editor

You can write workflows directly from the UI. When writing your workflows, the UI provides:

  • autocompletion
  • syntax validation
  • embedded plugin documentation
  • topology view (a view of your dependencies as a Directed Acyclic Graph) that gets updated live as you modify and add new tasks.

Getting Started

To get a local copy up and running, follow the steps below.

Prerequisites

Make sure that Docker and Docker Compose are installed and that Docker is running on your system. The default installation below relies on both.

Launch Kestra

Download the Docker Compose file:

curl -o docker-compose.yml https://raw.githubusercontent.com/kestra-io/kestra/develop/docker-compose.yml

Alternatively, you can download the same file with wget: wget https://raw.githubusercontent.com/kestra-io/kestra/develop/docker-compose.yml

Start Kestra:

docker-compose up

Open http://localhost:8080 in your browser and create your first flow.

Hello-World flow

Here is a simple example that logs a "Hello world!" message to the terminal:

id: hello
namespace: prod
tasks:
  - id: hello-world
    type: io.kestra.core.tasks.log.Log
    message: Hello world!

Plugins

Kestra is built on a plugin system. You can use an existing plugin to interact with your provider or, if none fits, follow the plugin developer guide to develop your own.

For a full list of plugins, check the plugins page.

Here are some examples of the available plugins:

Airbyte | Amazon S3 | Avro
Azure Blob Storage | Bash | Big Query
CSV | Cassandra | ClickHouse
DBT | Debezium MySQL | Debezium Postgres
Debezium Microsoft SQL Server | DuckDb | ElasticSearch
Fivetran | Email | FTP
FTPS | Google Cloud Storage | Google Drive
Google Sheets | Groovy | Http
JSON | Jython | Kafka
Kubernetes | MQTT | Microsoft SQL Server
MongoDb | MySQL | Nashorn
Node | Open PGP | Oracle
Parquet | Apache Pinot | Postgres
Power BI | Apache Pulsar | Python
Redshift | Rockset | SFTP
ServiceNow | Singer | Slack
Snowflake | Soda | Spark
Tika | Trino | Vectorwise
XML | Vertex AI | Vertica

This list is growing quickly and we welcome contributions.

Community Support

If you need help or have any questions, reach out using one of the following channels:

  • GitHub discussions - useful to start a conversation that is not a bug or feature request.
  • Slack - join the community and get the latest updates.
  • Twitter - follow us for the latest updates.

Roadmap

See the open issues for a list of proposed features (and known issues) or look at the project board.

Contributing

We love contributions, big or small. Check out our contributor guide for details on how to contribute to Kestra.

See our Plugin Developer Guide for details on developing and publishing Kestra plugins.

License

Apache 2.0 © Kestra Technologies
