Connecting your application to our event streaming broker
In this microlearning, we will explore how you can connect your custom-built application to the event streaming broker that we provide. We will focus on the relevant configuration elements you must consider within your code. On top of that, we will provide some pseudocode illustrating a simple connection, which should be treated as an illustration and not as a production-ready solution. Note that this microlearning is not relevant for Mendix app developers, who should check out the dedicated Mendix microlearning instead.
Should you have any questions, please contact academy@emagiz.com.
1. Prerequisites
- Basic knowledge of the eMagiz platform
- Access to our Kafka cluster
- Event streaming license activated
2. Key concepts
This microlearning centers around connecting to our event streaming broker from a high-code application.
- By connecting, in this context, we mean being able to produce or consume data on topics managed within our Kafka cluster.
- Knowing how to connect to our eMagiz Kafka cluster allows you to integrate with it rapidly.
3. Connecting your application to our event streaming broker
Connecting your application requires several configuration steps. We start by discussing the connection details, then message (de)serialization, and finally the additional requirements for consuming messages.
3.1 Connection details
With our Kafka broker comes the tenant concept. The tenant name is built up in the following way: emagiz-[customer]-[model]-[environment]. Below, you can find the default connection settings that you will need to connect. Ask your eMagiz contact for the specific connection details applicable to your connection.
| Attribute | Format | Example | Explanation |
|---|---|---|---|
| Bootstrap servers | proxy-0.kafka.[tenant].kpn-dsh.com:9091, proxy-1.kafka.[tenant].kpn-dsh.com:9091, proxy-2.kafka.[tenant].kpn-dsh.com:9091 | proxy-0.kafka.emagiz-mcrlng-test.poc.kpn-dsh.com:9091, proxy-1.kafka.emagiz-mcrlng-test.poc.kpn-dsh.com:9091, proxy-2.kafka.emagiz-mcrlng-test.poc.kpn-dsh.com:9091 | |
| Security protocol | SSL | SSL | |
| SSL keystore certificate | PEM-encoded certificate string | -----BEGIN CERTIFICATE-----… | It used to be a JKS file. It will be provided to you upon request. |
| SSL keystore key | PKCS#8 PEM-encoded string, RSA-encrypted | -----BEGIN ENCRYPTED PRIVATE KEY-----… | If you cannot handle the encrypted file or prefer it unencrypted, you can decrypt it using the CLI command openssl pkcs8 -in [filename] -out [target-filename] |
| SSL key passphrase | | | Will be provided to you on request |
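To make these settings concrete, the sketch below shows how they could be supplied to a plain Java Kafka client. This is a minimal sketch, not a production-ready solution: the tenant, PEM strings, and passphrase are placeholders that must be replaced with the details provided by your eMagiz contact, and it assumes a Kafka client version (2.7+) that accepts PEM material directly.

```java
import java.util.Properties;

public final class EmagizKafkaConfig {

    /** Base connection properties for the eMagiz Kafka cluster; all values are placeholders. */
    public static Properties baseConnection() {
        Properties props = new Properties();
        props.put("bootstrap.servers",
                "proxy-0.kafka.[tenant].kpn-dsh.com:9091,"
                + "proxy-1.kafka.[tenant].kpn-dsh.com:9091,"
                + "proxy-2.kafka.[tenant].kpn-dsh.com:9091");
        props.put("security.protocol", "SSL");
        // Kafka clients 2.7+ can read the PEM strings directly, without a JKS keystore.
        props.put("ssl.keystore.type", "PEM");
        props.put("ssl.keystore.certificate.chain", "-----BEGIN CERTIFICATE-----\n<provided certificate>");
        props.put("ssl.keystore.key", "-----BEGIN ENCRYPTED PRIVATE KEY-----\n<provided key>");
        props.put("ssl.key.password", "<passphrase provided on request>");
        return props;
    }
}
```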
Topic naming is of the utmost importance when connecting to our Event Streaming broker. Below, we outline the naming conventions for topics.
| Type | Format | Example |
|---|---|---|
| Topic name | stream.emagiz---[tenant]-[topicname] | stream.emagiz---emagiz-mcrlng-test.topic |
| Produce to topic | stream.emagiz---[tenant]-[topicname].[tenant] | stream.emagiz---emagiz-mcrlng-test.topic.emagiz-mcrlng-test |
| Consume from topic | stream.emagiz---[tenant]-[topicname]\..* | stream.emagiz---emagiz-mcrlng-test.topic\..* |
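As a small illustration, the helper below assembles the produce topic name and the consume pattern from a tenant and topic name. The class and method names are our own invention, not part of any eMagiz library.

```java
/** Illustrative helpers for building topic names; not part of any eMagiz library. */
public final class TopicNames {

    private static final String PREFIX = "stream.emagiz---";

    /** Fully qualified name to produce to: stream.emagiz---[tenant]-[topicname].[tenant] */
    public static String produceTopic(String tenant, String topicName) {
        return PREFIX + tenant + "-" + topicName + "." + tenant;
    }

    /** Regex pattern to consume from: stream.emagiz---[tenant]-[topicname]\..* */
    public static String consumePattern(String tenant, String topicName) {
        return PREFIX + tenant + "-" + topicName + "\\..*";
    }
}
```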
3.1.1 Recommended consumer/producer settings
When many messages need to be produced to a topic, or a topic that receives many messages needs to be consumed, the settings used for the Kafka clients can strongly affect throughput. Based on our testing, we recommend the following settings:

| Setting | Type | Value | Explanation |
|---|---|---|---|
| Batch size | Producer | 200000 | The maximum size (in bytes) of the batch the producer will attempt to send at once. This prevents every individual message from being sent on its own. |
| Linger | Producer | 300 ms | The time to wait until enough messages are available to send a single batch. |
| Minimum fetch amount | Consumer | 100000 | The minimum amount of data (in bytes) the consumer will attempt to receive at once. When less is available, the fetch returns after the wait timeout has passed. |
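Assuming these settings correspond to the standard Kafka client properties batch.size, linger.ms, and fetch.min.bytes, they could be applied as in the sketch below; verify the values against your own throughput tests.

```java
import java.util.Properties;

public final class ThroughputTuning {

    /** Producer tuning, assuming the standard Kafka properties batch.size and linger.ms. */
    public static Properties producerTuning() {
        Properties props = new Properties();
        props.put("batch.size", "200000"); // maximum batch size in bytes
        props.put("linger.ms", "300");     // wait up to 300 ms to fill a batch
        return props;
    }

    /** Consumer tuning, assuming the standard Kafka property fetch.min.bytes. */
    public static Properties consumerTuning() {
        Properties props = new Properties();
        props.put("fetch.min.bytes", "100000"); // minimum amount of data per fetch, in bytes
        return props;
    }
}
```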
3.2 Message (de)serialization
Our event broker requires message serialization. This means that any message produced for the event broker needs to be serialized, or it may fail to be processed by other systems. Similarly, all consumers have to apply message deserialization.
eMagiz will provide a protobuf envelope that defines the format. This file can be used with the protoc CLI tool to compile it into a language of your choice. You can find the envelope attached below.
3.2.1 Producing messages
When producing messages, the Kafka message key should be built up according to the envelope specified in the attached .proto file.
As an example, the message key can be built up as follows:
| Field | Value | Explanation |
|---|---|---|
| key | * | Free text that references something unique. |
| header.identifier.tenant | tenant name | Required field; needs to point to the [customer]-[model]-[environment] belonging to the topic you are connecting to. |
| header.identifier.client | * | A descriptive client name representing your application, used for traceability and support. For example, your order-processing app could use 'order-app'. |
| header.retained | False | Not used at this point; can default to false. |
| header.qos | 1 (or RELIABLE) | Quality of Service identifier, either 0 (best effort) or 1 (reliable). We recommend defaulting to 1. |
The data envelope that contains the actual data is significantly more straightforward to set up. Only the ‘payload’ attribute needs to be set with your payload.
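To illustrate, the key and data envelopes could be assembled along the following lines. The generated class and field names (KeyEnvelope, KeyHeader, Identifier, QoS, DataEnvelope) are hypothetical: they depend on how you compile the provided .proto file, so follow the structure of the envelope you actually receive.

```java
import com.google.protobuf.ByteString;

// KeyEnvelope, KeyHeader, Identifier, QoS, and DataEnvelope are hypothetical
// names for classes generated from the provided envelope .proto file.
public final class EnvelopeExamples {

    static KeyEnvelope exampleKey() {
        return KeyEnvelope.newBuilder()
                .setKey("order-12345") // free text referencing something unique
                .setHeader(KeyHeader.newBuilder()
                        .setIdentifier(Identifier.newBuilder()
                                .setTenant("emagiz-mcrlng-test") // [customer]-[model]-[environment]
                                .setClient("order-app"))         // descriptive client name
                        .setRetained(false)                      // not used at this point
                        .setQos(QoS.RELIABLE))                   // 1 = reliable (recommended)
                .build();
    }

    static DataEnvelope exampleData() {
        // The data envelope only needs its payload attribute set.
        return DataEnvelope.newBuilder()
                .setPayload(ByteString.copyFromUtf8("<your payload>"))
                .build();
    }
}
```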
How to use the proto files
- For Java applications, a library is attached below. It contains the (de)serializers for both the Key and Data parts of messages:
- com.emagiz.components.kafka.dsh.serialization.DataEnvelopeSerializer for data
- com.emagiz.components.kafka.dsh.serialization.KeyEnvelopeSerializer for keys. Note that the KeyEnvelopeSerializer requires two config properties to be set:
- emagiz.dsh.kafka.tenant used to provision the ‘header.identifier.tenant’ field.
- emagiz.dsh.kafka.publisher used to provision the header.identifier.client field to a value describing the producer.
- For example, to generate Python sources from the envelope (assuming the envelope is called envelope.proto and is present in the working directory), you can execute protoc --python_out=. envelope.proto, which will result in an envelope_pb2.py file that you can import into your Python code. An example producer written in Python is attached to this microlearning.
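Putting this together for a Java producer, the serializers and the two required config properties could be wired up roughly as follows, reusing the hypothetical baseConnection() helper from the connection sketch in section 3.1. This is a sketch, not a production-ready solution.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public final class ExampleProducer {
    public static void main(String[] args) {
        // Base connection properties from the sketch in section 3.1.
        Properties props = EmagizKafkaConfig.baseConnection();
        props.put("key.serializer",
                "com.emagiz.components.kafka.dsh.serialization.KeyEnvelopeSerializer");
        props.put("value.serializer",
                "com.emagiz.components.kafka.dsh.serialization.DataEnvelopeSerializer");
        // Required by the KeyEnvelopeSerializer, as described above.
        props.put("emagiz.dsh.kafka.tenant", "emagiz-mcrlng-test");
        props.put("emagiz.dsh.kafka.publisher", "order-app");

        // The exact key/value types depend on the provided serializer classes;
        // Object is used here purely for illustration.
        Object key = null;  // build the key envelope as in section 3.2.1
        Object data = null; // build the data envelope as in section 3.2.1
        try (KafkaProducer<Object, Object> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>(
                    "stream.emagiz---emagiz-mcrlng-test.topic.emagiz-mcrlng-test", key, data));
        }
    }
}
```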
3.2.2 Consuming messages
Similar to producing, the consuming side must also be able to deserialize the messages on the topic. For this, a deserializer should be set. For Java applications, the following deserializers from the provided library can be used:
- com.emagiz.components.kafka.dsh.serialization.DataEnvelopeDeserializer for data
- com.emagiz.components.kafka.dsh.serialization.KeyEnvelopeDeserializer for keys
For other languages, refer to the section above regarding producing messages for general instructions on using the provided proto file to create files for deserializing.
3.3 Other requirements for consuming messages
For your consumer to be authorized to access the topic, a naming practice for the consumer group ID is enforced. To connect, the consumer group ID must start with [tenant].[username]. As an example, the consumer group ID emagiz-mcrlng-test.user.[free-text]_0 follows this convention.
Furthermore, in contrast to how topics are currently created, topics may have multiple suffixes, depending on the data source. Therefore, we suggest consuming from a topic pattern rather than a fixed list of names. For example, when connecting to the available sample topic 'consume' that is shown below, we recommend connecting on the pattern stream.emagiz---emagiz-mcrlng-test-consume.* instead of the fully qualified name stream.emagiz---emagiz-mcrlng-test-consume.emagiz-mcrlng-test. If that is not possible for your application, the fully qualified topic name can be used for now.
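For Java, these consumer requirements could be combined as in the sketch below, again reusing the hypothetical baseConnection() helper from section 3.1; the group ID and topic pattern are illustrative values.

```java
import java.time.Duration;
import java.util.Properties;
import java.util.regex.Pattern;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public final class ExampleConsumer {
    public static void main(String[] args) {
        // Base connection properties from the sketch in section 3.1.
        Properties props = EmagizKafkaConfig.baseConnection();
        props.put("key.deserializer",
                "com.emagiz.components.kafka.dsh.serialization.KeyEnvelopeDeserializer");
        props.put("value.deserializer",
                "com.emagiz.components.kafka.dsh.serialization.DataEnvelopeDeserializer");
        // The consumer group ID must start with [tenant].[username].
        props.put("group.id", "emagiz-mcrlng-test.user.example-consumer_0");

        try (KafkaConsumer<Object, Object> consumer = new KafkaConsumer<>(props)) {
            // Subscribe on a topic pattern rather than a fixed name (dots escaped for regex).
            consumer.subscribe(Pattern.compile("stream\\.emagiz---emagiz-mcrlng-test-consume\\..*"));
            while (true) {
                ConsumerRecords<Object, Object> records = consumer.poll(Duration.ofMillis(500));
                records.forEach(r -> System.out.println(r.key() + " -> " + r.value()));
            }
        }
    }
}
```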
4. Key takeaways
- Ask your eMagiz contact for the relevant specific connection details applicable to your connection.
- Topic naming is paramount when connecting to our Event Streaming broker.
- As a general best practice, keep a connection open and use that to send or receive multiple messages instead of opening a new connection each time you wish to consume or produce a message.
- Our event broker requires message serialization.
- This requires a protobuf envelope.
- For your consumer to be authorized to access the topic, a naming practice for the consumer group ID will be enforced.
5. Suggested Additional Readings
If you are interested in this topic and want more information on it, please see the following links: