Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language | Description
---|---|---|---|---|---|---|---|---|---|---|---
Akhq | 2,851 | | | | 3 days ago | | | 135 | apache-2.0 | Java | Kafka GUI for Apache Kafka to manage topics, topics data, consumers group, schema registry, connect and more...
Schema Registry | 1,930 | | | | 16 hours ago | | | 257 | other | Java | Confluent Schema Registry for Kafka
Nakadi | 881 | | | | a day ago | 1 | July 25, 2016 | 47 | mit | Java | A distributed event bus that implements a RESTful API abstraction on top of Kafka-like queues
Mypipe | 390 | | | | 7 years ago | | | 19 | apache-2.0 | Scala | MySQL binary log consumer with the ability to act on changed rows and publish changes to different systems, with emphasis on Apache Kafka.
Spinaltap | 318 | | | | 2 years ago | | | | apache-2.0 | Java | Change Data Capture (CDC) service
Karapace | 284 | | | | 3 days ago | | | 54 | apache-2.0 | HTML | Karapace - Your Apache Kafka® essentials in one tool
Kafka Connect Oracle | 237 | | | | 2 years ago | | | 57 | apache-2.0 | Java | Kafka Source Connector for Oracle
Abris | 195 | | 5 | | 5 months ago | 17 | October 06, 2020 | 9 | apache-2.0 | Scala | Avro SerDe for Apache Spark structured APIs.
Srclient | 191 | | 16 | | 10 days ago | 27 | June 03, 2022 | 2 | apache-2.0 | Go | Golang Client for Schema Registry
Operators | 179 | | | | 2 years ago | | | 18 | apache-2.0 | Shell | Collection of Kubernetes Operators built with KUDO.
This application is a scheduler for low-frequency and long-term scheduling of delayed messages to Kafka topics.
This component was initially designed for Sky's metadata ingestion pipeline. We wanted to manage content expiry (for scheduled airings or on-demand assets) in a single component, instead of implementing the expiry logic in every consumer.
Given that the pipeline is based on Kafka, it felt natural to use it as input, output and data store.
The Kafka Message Scheduler (KMS for short) consumes messages from configured source (schedule) topics. On these topics, the message key is the schedule ID and the message value is an Avro-encoded schedule.

A schedule is composed of:

- the topic to send the message to
- the timestamp specifying when the message should be delivered
- the key and value of the message to be sent
The KMS is responsible for sending the actual message to the specified topic at the specified time.
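As a rough illustration, the Schedule case class could look something like the sketch below. The field names here are assumptions; the generated Avro schema (see the Schema section) is the authoritative definition:

```scala
import java.time.OffsetDateTime

// Hypothetical sketch of the Schedule case class - field names are assumptions,
// the generated Avro schema is the source of truth.
final case class Schedule(
  time: OffsetDateTime,  // when the message should be delivered
  topic: String,         // the topic to send the message to
  key: Array[Byte],      // key of the message to be sent
  value: Array[Byte]     // value of the message to be sent
)
```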
Note: If the timestamp of when to deliver the message is in the past, the schedule will be sent immediately.
The schedule ID can be used to delete a scheduled message, by writing a delete message (with a null message value) for that ID to the source topic.
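For example, a schedule could be deleted with a plain Kafka producer by sending a tombstone for its ID. This is a minimal sketch, assuming a source topic named `scheduler` and a schedule ID of `my-schedule-id`:

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.kafka.common.serialization.{ByteArraySerializer, StringSerializer}

val props = new Properties()
props.put("bootstrap.servers", "localhost:9092")

val producer =
  new KafkaProducer[String, Array[Byte]](props, new StringSerializer, new ByteArraySerializer)

// A null value is a tombstone: the KMS treats it as a delete for this schedule ID,
// and log compaction will eventually remove the schedule from the topic.
producer.send(new ProducerRecord[String, Array[Byte]]("scheduler", "my-schedule-id", null))
producer.flush()
producer.close()
```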
When the KMS starts up, it uses the kafka-topic-loader to consume all messages from the configured `schedule-topics` and populate the scheduling actor's state. Once loading has completed, all of the loaded schedules are scheduled and the application starts normal processing. This means that schedules that have been fired and tombstoned, but not yet compacted, will not be replayed during startup.
To generate the Avro schema from the Schedule case class, run `sbt schema`. The schema will be written to `avro/target/schemas/schedule.avsc`.
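As a quick sanity check, the generated schema can be parsed with the standard Apache Avro library. This sketch just assumes the file exists at the path above:

```scala
import java.io.File
import org.apache.avro.Schema

// Parse the generated schema and print its fields - a quick way to confirm
// what the KMS expects in a schedule message.
val schema = new Schema.Parser().parse(new File("avro/target/schemas/schedule.avsc"))
schema.getFields.forEach(f => println(s"${f.name}: ${f.schema}"))
```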
To start the KMS and its dependent services locally:

```
docker-compose pull && docker-compose up -d
```
With the services running, you can send a message to the defined scheduler topic (`scheduler` in the example above). See the Schema section for details of generating the Avro schema to be used.
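Putting the pieces together, the sketch below produces a schedule using the generated schema and Avro's GenericRecord API. The field names and the exact wire format are assumptions (check the generated `schedule.avsc` and the project's serialization for the real ones), but it shows the overall shape of writing a schedule to the scheduler topic:

```scala
import java.io.{ByteArrayOutputStream, File}
import java.nio.ByteBuffer
import java.util.Properties
import org.apache.avro.Schema
import org.apache.avro.generic.{GenericData, GenericDatumWriter, GenericRecord}
import org.apache.avro.io.EncoderFactory
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.kafka.common.serialization.{ByteArraySerializer, StringSerializer}

val schema = new Schema.Parser().parse(new File("avro/target/schemas/schedule.avsc"))

// Build the schedule record - field names here are assumptions, taken from the
// schedule description above; check the generated schema for the real ones.
val schedule: GenericRecord = new GenericData.Record(schema)
schedule.put("time", System.currentTimeMillis() + 60000L) // deliver in one minute
schedule.put("topic", "destination-topic")
schedule.put("key", ByteBuffer.wrap("message-key".getBytes("UTF-8")))
schedule.put("value", ByteBuffer.wrap("message-value".getBytes("UTF-8")))

// Binary-encode the record.
val out = new ByteArrayOutputStream()
val encoder = EncoderFactory.get().binaryEncoder(out, null)
new GenericDatumWriter[GenericRecord](schema).write(schedule, encoder)
encoder.flush()

// Send it to the source topic, keyed by the schedule ID.
val props = new Properties()
props.put("bootstrap.servers", "localhost:9092")
val producer =
  new KafkaProducer[String, Array[Byte]](props, new StringSerializer, new ByteArraySerializer)
producer.send(new ProducerRecord("scheduler", "my-schedule-id", out.toByteArray))
producer.flush()
producer.close()
```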
Metrics are exposed and reported using Kamon. By default, the Kamon Prometheus reporter is used, and the scraping endpoint for Prometheus is exposed on port `9095` (configurable by setting the `PROMETHEUS_SCRAPING_ENDPOINT_PORT` environment variable).

Prometheus is included as part of the docker-compose setup and exposes a monitoring dashboard on port `9090`.
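To verify that the endpoint is up, the metrics can be fetched with nothing more than the Scala standard library. This assumes the default port of `9095` and the conventional `/metrics` scrape path:

```scala
import scala.io.Source

// Fetch the Prometheus scrape endpoint and print the first few exposed metrics.
val metrics = Source.fromURL("http://localhost:9095/metrics")
try metrics.getLines().take(10).foreach(println)
finally metrics.close()
```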
The `schedule-topics` must be configured to use log compaction. This is for two reasons:

1. it allows schedules to be removed once they have fired, via the tombstones written for their schedule IDs
2. the KMS uses the `schedule-topics` to reconstruct its state - in case of a restart of the KMS, this ensures that schedules are not lost

It is advised that the log compaction configuration of the `schedule-topics` is quite aggressive, to keep restart times low. The recommended configuration is shown below:
```
cleanup.policy: compact
delete.retention.ms: 3600000
min.compaction.lag.ms: 0
min.cleanable.dirty.ratio: "0.1"
segment.ms: 86400000
segment.bytes: 100000000
```
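For instance, a schedule topic with this configuration could be created with Kafka's AdminClient. The topic name, partition count, and replication factor below are placeholders:

```scala
import java.util.Properties
import scala.jdk.CollectionConverters._
import org.apache.kafka.clients.admin.{AdminClient, NewTopic}

val props = new Properties()
props.put("bootstrap.servers", "localhost:9092")
val admin = AdminClient.create(props)

// Apply the recommended compaction settings from above.
val configs = Map(
  "cleanup.policy"            -> "compact",
  "delete.retention.ms"       -> "3600000",
  "min.compaction.lag.ms"     -> "0",
  "min.cleanable.dirty.ratio" -> "0.1",
  "segment.ms"                -> "86400000",
  "segment.bytes"             -> "100000000"
).asJava

val topic = new NewTopic("scheduler", 3, 1.toShort).configs(configs)
admin.createTopics(List(topic).asJava).all().get()
admin.close()
```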
Until this issue is addressed, the KMS does not fully support horizontal scaling. Multiple instances can be run, and Kafka will balance the partitions between them; however, schedules are likely to be duplicated, because when a rebalance happens the state for the rebalanced partitions is not removed from the original instance. If you want to run multiple instances before that issue is addressed, it is best not to attempt dynamic scaling; instead, start with your desired number of instances.