Project Name | Stars | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language | Description
---|---|---|---|---|---|---|---|---
Redpanda | 6,860 | 16 hours ago | 343 | April 25, 2021 | 1,348 | | C++ | Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
Examples | 1,734 | 2 days ago | 573 | July 07, 2022 | 95 | apache-2.0 | Shell | Apache Kafka and Confluent Platform examples and demos
Spring Cloud Dataflow | 991 | a day ago | 71 | April 05, 2022 | 204 | apache-2.0 | Java | Microservices-based streaming and batch data processing in Cloud Foundry and Kubernetes
Stream Reactor | 935 | 2 days ago | 1 | December 27, 2018 | 86 | apache-2.0 | Scala | Streaming reference architecture for ETL with Kafka and Kafka-Connect. See http://lenses.io for a unified solution for managing connectors, an advanced SQL engine for Kafka and Kafka Streams, cluster monitoring and alerting, and more.
Tigris | 787 | a day ago | 69 | October 26, 2022 | 48 | apache-2.0 | Go | Tigris is an open-source serverless NoSQL database and search platform.
Gb28181.solution | 497 | a month ago | | | 2 | other | C# | Linux/Win/Docker/Kubernetes/Chart/Kustomize/GB28181/SIP/RTP/SDP/WebRTC; acts as a superior or subordinate domain, with platform-level cascading interconnection
Kubernetes 101 | 474 | a month ago | | | 13 | mit | HTML | Kubernetes 101, by Jeff Geerling
K8s | 334 | 15 days ago | | | 18 | apache-2.0 | Shell | NATS on Kubernetes with Helm Charts
Airy | 330 | a month ago | 303 | April 23, 2021 | 135 | apache-2.0 | Java | 💬 Open-source app framework for building streaming apps with real-time data: 💎 build real-time data pipelines and make real-time data universally accessible; 🤖 join historical and real-time data in the stream to create smarter ML and AI applications; ⚡ standardize complex data ingestion and stream data to apps with pre-built connectors
Cloudflow | 323 | a month ago | 604 | October 03, 2022 | 125 | apache-2.0 | Scala | Cloudflow enables users to quickly develop, orchestrate, and operate distributed streaming applications on Kubernetes.
If you are an existing customer of Lightbend and we have not contacted you, please reach out to support.
All existing contracts will be honored, and assistance with migrating to new tools is available.
As data pipelines become first-class citizens in microservices architectures, Cloudflow gives developers data-optimized programming abstractions and run-time tooling for Kubernetes. In a nutshell, Cloudflow is an application development toolkit comprising:

- An API definition for `Streamlet`, the core abstraction in Cloudflow.
- An extensible set of runtime implementations for `Streamlet`(s).
- A `Streamlet` composition model driven by a `blueprint` definition.
- A set of `sbt` plugins that are able to package your application into a deployable container.
- A `kubectl` plugin that facilitates manual and scripted management of the application.

The different parts of Cloudflow work in unison to dramatically accelerate your application development efforts, reducing the time required to create, package, and deploy an application from weeks to hours.
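To make the `blueprint`-driven composition concrete, here is a hedged sketch of what a blueprint definition might look like. All streamlet names and implementing classes (`sensors.SensorIngress`, etc.) are invented for illustration; consult the Cloudflow documentation for the exact schema of your version:

```hocon
blueprint {
  streamlets {
    // Each entry names a streamlet instance and binds it to an implementing class.
    ingress   = sensors.SensorIngress
    processor = sensors.MetricProcessor
    egress    = sensors.MetricEgress
  }
  connections {
    // Wire outlets to inlets to form the application topology.
    ingress.out   = [processor.in]
    processor.out = [egress.in]
  }
}
```

The blueprint is what lets Cloudflow verify, before deployment, that every inlet is connected and that connected ports carry compatible schemas.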
Basic components of a Cloudflow Application
As we discussed above, Cloudflow allows developers to quickly build and deploy distributed stream processing applications by breaking such applications into smaller stream processing units called `Streamlets`. Each `Streamlet` represents an independent stream processing component that implements a self-contained stage of the application logic. `Streamlets` let you break down your application into logical pieces that communicate with each other in a streaming fashion to accomplish an end-to-end goal. `Streamlets` can be composed into larger systems using blueprints, which specify how `Streamlets` can be connected together to form a topology.

`Streamlets` expose one or more inlets and outlets that represent the data consumed and produced by the `Streamlet`, respectively. Inlets and outlets are schema-driven, ensuring that data flows are always consistent and that connections between `Streamlets` are compatible.
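The inlet/outlet idea can be illustrated with a small, self-contained Scala sketch. This is a toy model of the concept, not the real Cloudflow API (which uses classes such as `AvroInlet`/`AvroOutlet` and ties ports to Avro schemas); all names here are invented:

```scala
// Toy model of schema-driven ports: the type parameter plays the role
// of the schema, so only matching ports can be wired together.
final case class Inlet[T](name: String)
final case class Outlet[T](name: String)

// A streamlet consumes records from its inlet and produces to its outlet.
trait Streamlet[I, O] {
  def in: Inlet[I]
  def out: Outlet[O]
  def logic(record: I): O // the self-contained processing stage
}

// Example record type and a processor stage that doubles a metric value.
final case class Metric(deviceId: String, value: Double)

object MetricProcessor extends Streamlet[Metric, Metric] {
  val in  = Inlet[Metric]("in")
  val out = Outlet[Metric]("out")
  def logic(m: Metric): Metric = m.copy(value = m.value * 2)
}
```

Because the ports are typed, attempting to connect `MetricProcessor.out` to an inlet expecting a different record type would fail at compile time, which mirrors the schema-compatibility check Cloudflow performs on blueprints.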
In the diagram above, Streamlet 1 has one outlet, which feeds data to Streamlet 2's inlet. Streamlet 1 is a component that generates data or gets its data from an external system, e.g., via an HTTP request.
Streamlet 2's outlet then feeds its output to Streamlet 3's inlet. In this application, Streamlet 2 does the actual data processing.
Streamlet 3 may then store its data in some external system.
The example described here is a minimal Cloudflow application.
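The three-stage topology above can be sketched as plain function composition. This is a deliberately simplified stand-in for real streamlets (the ingress reads from a fixed list rather than an HTTP endpoint, and the egress returns strings rather than writing to storage); every name here is hypothetical:

```scala
// Streamlet 1: ingress that produces raw readings. A real ingress might
// receive these over HTTP; here they come from a fixed list.
def ingress: List[String] = List("21.5", "22.0", "invalid", "19.8")

// Streamlet 2: processor that parses the readings, dropping bad records.
def processor(raw: List[String]): List[Double] =
  raw.flatMap(s => s.toDoubleOption)

// Streamlet 3: egress that "stores" the results in an external system,
// modelled here as returning the formatted records.
def egress(values: List[Double]): List[String] =
  values.map(v => s"stored: $v")

// Wiring outlet -> inlet, as a blueprint would.
val results: List[String] = egress(processor(ingress))
```

In a deployed application the three stages would run as independent components exchanging data through the underlying pub-sub system, but the dataflow is the same.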
The data sent between `Streamlets` is safely persisted in the underlying pub-sub system, allowing for independent lifecycle management of the different components. `Streamlets` can be scaled up and down to meet the load requirements of the application. The underlying data streams are partitioned to allow for parallelism in a distributed application execution.
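The partitioning idea can be sketched as follows. This is a generic illustration of key-based partitioning (the scheme Kafka-style pub-sub systems use), not Cloudflow-specific code; the record type and partition count are invented:

```scala
// Records with the same key always map to the same partition, so each
// scaled-out streamlet instance can own a disjoint subset of the stream.
final case class Record(key: String, value: Int)

def partitionOf(key: String, numPartitions: Int): Int =
  math.abs(key.hashCode % numPartitions)

val records = List(
  Record("sensor-a", 1),
  Record("sensor-b", 2),
  Record("sensor-a", 3)
)

// Group the stream by partition; instance i of a scaled streamlet
// would consume only the records assigned to partition i.
val byPartition: Map[Int, List[Record]] =
  records.groupBy(r => partitionOf(r.key, numPartitions = 4))
```

Because partitioning is deterministic per key, all records for `sensor-a` land in the same partition and are therefore processed in order by a single instance, while different keys can be processed in parallel.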
The `Streamlet` logic can be written using an extensible choice of streaming runtimes, such as Akka Streams and Spark. The lightweight API exposes the raw power of the underlying runtime and its libraries while providing a higher-level abstraction for composing `Streamlets` and expressing data schemas. You write your code using the APIs you are already familiar with.

Applications are deployed as a whole. Cloudflow takes care of deploying the individual `Streamlets` and making sure connections get translated into data flowing between them at runtime.
Learn more about the Cloudflow building blocks in our Cloudflow Core Concepts documentation.
Technologies like mobile, the Internet of Things (IoT), Big Data analytics, machine learning, and others are driving enterprises to modernize how they process large volumes of data. A rapidly growing percentage of that data now arrives in the form of data streams. To extract value from that data as soon as it arrives, those streams require near-real-time processing. We use the term "Fast Data" to describe applications and systems that deal with such requirements.
The Fast Data landscape has been rapidly evolving, with tools like Spark, Flink, and Kafka Streams emerging from the world of large-scale data processing while projects like Reactive Streams and Akka Streams have emerged from the world of application development and high-performance networking.
The demand for availability, scalability, and resilience is forcing fast data architectures to become more like microservice architectures. Conversely, successful organizations building microservices find their data needs grow with their organization while their data sources are becoming more stream-like and more real-time. Hence, there is a unification happening between streaming data and microservice architectures.
It can be quite hard to develop, deploy, and operate large-scale microservices-based systems that can take advantage of streaming data and seamlessly integrate with systems for analytics processing and machine learning. The individual technologies are well-documented, but combining them into fully integrated, unified systems is no easy task.
Cloudflow aims to make this easier by integrating the most popular streaming frameworks into a single platform for creating and running distributed Fast Data applications on Kubernetes.