bazel test //...
Project Name | Stars | Downloads | Repos Using This | Packages Using This | Most Recent Commit | Total Releases | Latest Release | Open Issues | License | Language |
---|---|---|---|---|---|---|---|---|---|---|
Maxwell | 3,589 | 2 | 8 days ago | 84 | July 29, 2022 | 219 | other | Java | ||
Maxwell's daemon, a mysql-to-json kafka producer | ||||||||||
Schema Registry | 1,929 | 19 hours ago | 257 | other | Java | |||||
Confluent Schema Registry for Kafka | ||||||||||
Pmacct | 909 | a day ago | 41 | other | C | |||||
pmacct is a small set of multi-purpose passive network monitoring tools [NetFlow IPFIX sFlow libpcap BGP BMP RPKI IGP Streaming Telemetry]. | ||||||||||
Kt | 893 | 4 months ago | 24 | September 19, 2017 | 5 | mit | Go | |||
Kafka command line tool that likes JSON | ||||||||||
Zerocode | 712 | 1 | 3 | 11 days ago | 27 | December 06, 2020 | 103 | apache-2.0 | Java | |
A community-developed, free, open source, microservices API automation and load testing framework built using JUnit core runners for Http REST, SOAP, Security, Database, Kafka and much more. Zerocode Open Source enables you to create, change, orchestrate and maintain your automated test cases declaratively with absolute ease. | ||||||||||
Kafka Pixy | 693 | 8 months ago | 16 | July 09, 2021 | 13 | apache-2.0 | Go | |||
gRPC/REST proxy for Kafka | ||||||||||
Json Data Generator | 368 | 10 months ago | 3 | May 09, 2022 | 21 | apache-2.0 | Java | |||
A robust, generic, streaming random json data generator for your data | ||||||||||
Karapace | 283 | a day ago | 55 | apache-2.0 | HTML | |||||
Karapace - Your Apache Kafka® essentials in one tool | ||||||||||
Storagetapper | 269 | a year ago | 4 | November 19, 2021 | 21 | mit | Go | |||
StorageTapper is a scalable realtime MySQL change data streaming, logical backup and logical replication service | ||||||||||
Kafka Streams | 210 | 6 years ago | 2 | apache-2.0 | Java | |||||
Code examples for working with Kafka Streams |
This source demonstrates how to process a stream of sensor data using Kafka Streams.
The sensors produce a stream of records, including sensor ID, a timestamp and the current state (on or off). The desired result is a stream of records enriched with the duration the sensor has been in this state.
For example, a stream
Name | Timestamp | State |
---|---|---|
Sensor 1 |
1984-01-22T15:45:00Z |
off |
Sensor 1 |
1984-01-22T15:45:10Z |
off |
Sensor 1 |
1984-01-22T15:45:30Z |
on |
Sensor 1 |
1984-01-22T15:46:30Z |
off |
should produce
Name | Timestamp | State | Duration |
---|---|---|---|
Sensor 1 |
1984-01-22T15:45:00Z |
off |
10s |
Sensor 1 |
1984-01-22T15:45:00Z |
off |
30s |
Sensor 1 |
1984-01-22T15:45:30Z |
on |
60s |
Which tells us that Sensor 1 was off from 15:45:00 for 30 seconds and on from 15:45:30 for 60 seconds.
Note that the second off reading produced an intermediate result.
Duplicate readings of the same state generate intermediate results, and delayed readings (timestamps preceding previously seen values) are treated as errors.
These are deliberate choices and can easily be changed.
Care has been taken to keep the business logic independent of implementation details like serialization formats.
The data model is in the model directory, the business logic in topology.
The tests test the topology with seven different (de-)serializers, protocol buffers, JSON, Apache Avro, the Confluent variants of them and Amazon Ion. Since the example needs an input format, a result format and a format for the state store we have 343 (73) different combinations which are all tested.
While this abstraction might not be necessary in practice, it demonstrates two important design considerations:
The business logic should only depend on a data model, not capabilities of the serialization mechanism.
We can simply use
Duration::between
,
which is a simple call and easy to understand and test, instead of cluttering our logic with
conversions and unnecessary error-prone calculations.
The choice of (de-)serializers should depend on the requirements, not on what is just at hand.
While internal processing pipelines tend (but don’t have) to use one serialization mechanism, it perfectly valid and a good design decision to use different mechanisms for parts interfacing with external components.
Since the business logic is independent of the serialization mechanism, changing it is simple and
normally does not require retesting. Be aware of subtle pitfalls tough, while our Java code uses
nanosecond precision, which are correctly handled with protocol buffers and JSON, Avro truncates the
time stamps to microseconds. We would not catch these in our tests since we use only second steps
for testing, but this would show up when using real world data and comparing with equals
.
By refactoring the business logic to depend only on an abstract store, we speed up testing by a
factor of twenty
(bazel test //src/test/java/de/melsicon/kafka/sensors/logic:all
vs.
//src/test/java/de/melsicon/kafka/sensors/topology:all
), which demonstrates a potential
for improvement in development speed and testability.
To run all tests, use
bazel test //...
To run a single test, use
bazel test //src/test/java/de/melsicon/kafka/sensors/topology:all
The tests run with an embedded Kafka and mock schema registry, when necessary.
The main app needs Kafka running at localhost, port 9092 (see application.yaml). Start it with
bazel run //:kafka-sensors
Watch the results with
kafka-console-consumer --bootstrap-server localhost:9092 --topic result-topic
and publish sensor values with
kafka-console-producer --broker-list localhost:9092 --topic input-topic
The main app runs with JSON in- and output, so you can use sample sensor values like
{"id":"7331","time":443634300.0,"state":"off"}
{"id":"7331","time":443634310.0,"state":"off"}
{"id":"7331","time":443634330.0,"state":"on"}
{"id":"7331","time":443634390.0,"state":"off"}
As noted in Implementation of Business Logic the business login is independent of the serialization, in the spirit of hexagonal architecture. This of course requires some mapping, where we mostly use MapStruct for. This necessitates some limitations in data model naming conventions. MapStruct uses a fixed und quite unflexible accessor naming strategy, so you can’t really decide that protocol buffers should have one convention but Immutables another. Especially for Immutables we are forced to use JavaBeans-style naming convention, although this is not a JEE application.
Copyright 2019-2021 melsicon GmbH
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this material except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.