Perfect-Kafka
This project provides an express Swift wrapper of librdkafka.
This package builds with Swift Package Manager and is part of the Perfect project but can also be used as an independent module.
Release Notes for MacOS X
Before importing this library, please install librdkafka first:
$ brew install librdkafka
Please also note that a proper pkg-config path setting is required:
$ export PKG_CONFIG_PATH="/usr/local/lib/pkgconfig"
Release Notes for Linux
Before importing this library, please install librdkafka-dev first:
$ sudo apt-get install librdkafka-dev
Quick Start
Kafka Client Configurations
Before starting any stream operations, it is necessary to apply settings to clients, i.e., producers or consumers.
Perfect Kafka provides two categories of configuration: Kafka.Config() for global configurations and Kafka.TopicConfig() for topic configurations.
Initialization of Global Configurations
To create a configuration set with default values, simply call:
let conf = try Kafka.Config()
A new configuration can also be duplicated from an existing one:
let conf = try Kafka.Config()
// this will keep the original settings and duplicate a new one
let conf2 = try Kafka.Config(conf)
Initialization of Topic Configurations
Topic configurations are initialized in the same way as global configurations.
To create a topic configuration with default settings, call:
let conf = try Kafka.TopicConfig()
A new topic configuration can also be duplicated from an existing one:
let conf = try Kafka.TopicConfig()
// this will keep the original settings and duplicate a new one
let conf2 = try Kafka.TopicConfig(conf)
Access Settings of Configuration
Both Kafka.Config and Kafka.TopicConfig share the same API for accessing settings.
List All Variables with Value
Kafka.Config.properties and Kafka.TopicConfig.properties provide all settings as a dictionary:
// this will print out all variables in a configuration
print(conf.properties)
// for example, it will print out something like:
// ["topic.metadata.refresh.fast.interval.ms": "250",
//  "receive.message.max.bytes": "100000000", ...]
Get a Variable Value
Call get() to retrieve the value of a specific variable:
let maxBytes = try conf.get("receive.message.max.bytes") // maxBytes would be "100000000" by default
Set a Variable with New Value
Call set() to save a new value for a specific variable:
// this will restrict message receiving buffer to 1MB
try conf.set("receive.message.max.bytes", "1048576")
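When several settings are changed, it can help to see which ones now differ from the defaults. The sketch below is a minimal illustration that works on plain [String: String] dictionaries standing in for the properties output; the helper changedKeys is hypothetical and not part of Perfect-Kafka, and the dictionary type is an assumption based on the printed form shown earlier.

```swift
// Hypothetical helper (not part of Perfect-Kafka): list the keys whose
// values differ between two property dictionaries, sorted for stable output.
func changedKeys(from old: [String: String], to new: [String: String]) -> [String] {
  let keys = Set(old.keys).union(new.keys)
  return keys.filter { old[$0] != new[$0] }.sorted()
}

// Stand-in dictionaries in the shape printed by `properties`.
let before = ["receive.message.max.bytes": "100000000",
              "topic.metadata.refresh.fast.interval.ms": "250"]
var after = before
after["receive.message.max.bytes"] = String(1024 * 1024) // 1 MB = 1048576 bytes

print(changedKeys(from: before, to: after)) // ["receive.message.max.bytes"]
```

In real use, the two dictionaries would presumably be snapshots of a configuration's properties taken before and after calling set().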
Producer
Perfect-Kafka provides a Producer class to send data / messages to Kafka hosts. A Producer can send one message at a time, or send multiple messages in a batch. Messages can be either text strings or binary bytes.
let producer = try Producer("VideoTest")
let brokers = producer.connect(brokers: "host:9092")
if brokers > 0 {
  let _ = try producer.send(message: "hello, world!")
}
Before sending any actual messages, a few steps are required to set up the connection to Kafka hosts.
Producer Instance with a Topic
To initialize a Producer instance, a topic name is required no matter whether this topic exists in the Kafka hosts or not.
If the topic does not exist when connecting to Kafka hosts / brokers, Producer() will try to create a new one; otherwise it will use the existing topic for further operations.
For example, the demo below shows how to start a producer with a topic named "VideoTest":
let producer = try Producer("VideoTest")
Connect to Brokers
Use the connect() method to connect to one or more message brokers, i.e., Kafka hosts (host and port):
let brokers = producer.connect(brokers: "host1:9092,host2:9092,host3:9092")
On success, it returns the number of hosts connected.
Alternatively, brokers can be passed in other forms. For example, hosts can be an array of strings:
let brokers = producer.connect(brokers: ["host1:9092", "host2:9092", "host3:9092"])
or dictionary:
let brokers = producer.connect(brokers: ["host1": 9092, "host2": 9092, "host3": 9092])
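The three forms above describe the same broker set. As a rough illustration of how they relate, the sketch below normalizes the array and dictionary forms into the comma-separated string form. The brokerList helpers are hypothetical and are not part of Perfect-Kafka; the library may convert its arguments differently.

```swift
// Hypothetical helpers (not part of Perfect-Kafka): normalize the array and
// dictionary forms into the comma-separated "host:port" string form.
func brokerList(_ hosts: [String]) -> String {
  return hosts.joined(separator: ",")
}

func brokerList(_ hosts: [String: Int]) -> String {
  // Dictionaries are unordered, so sort by host name for a stable result.
  return hosts.sorted { $0.key < $1.key }
    .map { "\($0.key):\($0.value)" }
    .joined(separator: ",")
}

print(brokerList(["host1:9092", "host2:9092"]))    // host1:9092,host2:9092
print(brokerList(["host2": 9092, "host1": 9092]))  // host1:9092,host2:9092
```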
Send Messages
Perfect Kafka allows sending either text or binary messages to brokers, one at a time or in a batch.
Method | Description | Returns |
---|---|---|
send(message: String, key: String? = nil) | a text message with an optional key to send | an Int64 message id |
send(message: [Int8], key: [Int8] = []) | a binary message with an optional key to send | an Int64 message id |
send(messages: [(String, String?)]) | text messages with optional keys in an array | [Int64] message IDs for each message |
send(messages: [([Int8], [Int8])]) | binary messages with optional keys in an array | [Int64] message IDs for each message |
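The binary overloads take [Int8] rather than [UInt8], so text or raw bytes usually need a bit-pattern conversion first. The sketch below shows one way to do that with the Swift standard library and Foundation; signedBytes and text(from:) are hypothetical helper names and are not part of Perfect-Kafka.

```swift
import Foundation

// Hypothetical helper: convert a string's UTF-8 bytes into signed bytes,
// the element type used by the binary send overloads.
func signedBytes(_ text: String) -> [Int8] {
  return text.utf8.map { Int8(bitPattern: $0) }
}

// Hypothetical helper: decode signed bytes back into a string,
// returning nil when the bytes are not valid UTF-8.
func text(from bytes: [Int8]) -> String? {
  return String(bytes: bytes.map { UInt8(bitPattern: $0) }, encoding: .utf8)
}

let payload = signedBytes("é")   // 0xC3 0xA9 reinterpreted as signed bytes
print(payload)                   // [-61, -87]
print(text(from: payload) ?? "not UTF-8")
```

A payload built this way could then be passed as the message argument of the [Int8] overload listed in the table.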
Sent or Not
Perfect Kafka send() is an asynchronous function, so the library provides a few extra ways to determine the sending status of each message:
- OnSent() callback. If set, this event is called once for each message when it has actually been sent. For example: producer.OnSent = { print("msg #\($0) was sent") }. The only parameter of this event is the Int64 message id returned by send().
- producer.outbox is an [Int64] array listing the messages still in the sending queue. NOTE: Kafka is a high-performance streaming platform, so the presence of messages in the outbox does not mean that they failed to send. Do not resend a message unless it has been explicitly confirmed as failed.
- OnError() callback. Producer calls this event if something goes wrong, e.g., producer.OnError = { print("error: \($0)") } prints the error message when one occurs.
- flush(_ seconds: Int) method waits the given number of seconds for the message queue to clear and the outbox to flush.
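To make the bookkeeping concrete, the sketch below models the sent / pending distinction with a small local class. It is a stand-in for illustration only: PendingTracker and its methods are hypothetical, it does not talk to Kafka, and it is not how the library implements its outbox.

```swift
// Stand-in for the bookkeeping described above; not the library's implementation.
final class PendingTracker {
  private(set) var pending: Set<Int64> = []
  private(set) var errors: [String] = []

  // Record a message id as returned by a send call.
  func didQueue(_ id: Int64) { pending.insert(id) }
  // Intended to be called from an OnSent-style callback.
  func didSend(_ id: Int64) { pending.remove(id) }
  // Intended to be called from an OnError-style callback.
  func didFail(_ message: String) { errors.append(message) }
  // True once every queued id has been confirmed as sent.
  var isDrained: Bool { return pending.isEmpty }
}

let tracker = PendingTracker()
let ids: [Int64] = [1, 2, 3]
for id in ids { tracker.didQueue(id) }
tracker.didSend(1)
tracker.didSend(3)
print(tracker.pending)   // [2]
print(tracker.isDrained) // false
```

Note that id 2 is merely pending here, not failed, which matches the warning above about not resending messages just because they are still queued.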
Consumer
Before actually receiving messages from Kafka with a specific topic, a few procedures are required to initialize a Consumer instance:
let consumer = try Consumer("VideoTest")
let brokers = consumer.connect(brokers: ["host1": 9092, "host2": 9092, "host3": 9092])
guard brokers > 0 else {
  // connection failed
}//end guard
Partitions
Once connected, it is a good idea to get the information from the brokers to see if there are sufficient resources, i.e., partitions, for further operations:
let info = try consumer.brokerInfo()
print(info)
The variable info above is a MetaData structure, described below:
Member | Type | Description |
---|---|---|
brokers | [Broker] | An array of Broker structure |
topics | [Topic] | An array of Topic structure |
The Broker structure stores the information of a broker:
Member | Type | Description |
---|---|---|
id | Int | Broker Id |
host | String | Host name of the broker |
port | Int | Host port that listens |
The Topic structure mainly records which partitions are in use for a topic:
Member | Type | Description |
---|---|---|
name | String | Topic name |
err | Exception | Topic error reported by broker |
partitions | [Partition] | Partitions of this topic |
The Partition data structure is essential because it provides the partition id used for messaging:
Member | Type | Description |
---|---|---|
id | Int | Partition Id - use this to start / stop messaging |
err | Exception | Partition error reported by broker |
leader | Int | Leader broker |
replicas | [Int] | Replica brokers |
isrs | [Int] | In-Sync-Replica brokers |
In practice, partition information can be obtained as follows:
let consumer = try Consumer("VideoTest")
let brokers = consumer.connect(brokers: ["host1": 9092, "host2": 9092, "host3": 9092])
guard brokers > 0 else {
  // connection failed
}//end guard
consumer.OnArrival = { m in print("message : #\(m.offset) \(m.text)")}
let info = try consumer.brokerInfo()
guard info.topics.count > 0 else {
  // no topic found
}//end guard
guard info.topics[0].name == "VideoTest" else {
  // it is not the topic we want
}//end guard
let partitions = info.topics[0].partitions
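The snippet above assumes the wanted topic is the first entry in topics. A slightly more defensive variant is to look the topic up by name. The sketch below uses minimal local stand-in structs that mirror only the members listed in the tables above, so it runs without a broker; partitionIds is a hypothetical helper, not part of Perfect-Kafka.

```swift
// Minimal stand-ins mirroring a subset of the members listed in the tables above.
struct Partition { let id: Int }
struct Topic { let name: String; let partitions: [Partition] }
struct MetaData { let topics: [Topic] }

// Hypothetical helper: partition ids of a named topic, or nil if the topic is absent.
func partitionIds(of topicName: String, in info: MetaData) -> [Int]? {
  guard let topic = info.topics.first(where: { $0.name == topicName }) else {
    return nil
  }
  return topic.partitions.map { $0.id }
}

let sample = MetaData(topics: [
  Topic(name: "Other", partitions: [Partition(id: 0)]),
  Topic(name: "VideoTest", partitions: [Partition(id: 0), Partition(id: 1)])
])
print(partitionIds(of: "VideoTest", in: sample) ?? []) // [0, 1]
```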
Download Messages From A Partition
The code below shows how to download messages from a partition. In this demo, we assume let partId = partitions[0].id:
consumer.OnArrival = { m in
  print("message #\(m.offset) : \(m.text)")
}//end event
// start downloading
try consumer.start(partition: partId)
// run until end of program
while(notEndOfProgram) {
  let total = try consumer.poll(partition: partId)
  print("\(total) messages arrived in this moment")
}//end while
consumer.stop(partId)
Let's walk through this code:
First, the OnArrival() event is a callback that receives a Message data structure:
Member | Type | Description |
---|---|---|
err | Exception | Error: if the message is good or not |
topic | String | topic name of the message |
partition | Int | partition of the message |
isText | Bool | if the message is a valid UTF-8 text or not |
data | [Int8] | the original binary data of message body |
text | String | decoded message body in a UTF-8 string, if isText |
keyIsText | Bool | if the key is a valid UTF-8 text |
keybuf | [Int8] | the original binary data of optional key |
key | String | decoded key in a UTF-8 string, if keyIsText |
offset | Int64 | offset inside the topic |
Second, call start() to start downloading messages: func start(_ from: Position = .BEGIN, partition: Int32 = RD_KAFKA_PARTITION_UA). The parameters are:
- from: Position, the position in the partition from which to start downloading. Valid values are .BEGIN to download every message from the very beginning, .END to download the most recent one, .STORED to download the previously stored messages in case of failure, or .SPECIFY(Int64) to start downloading from a specific location. NOTE: use func store(_ offset: Int64, partition: Int32 = RD_KAFKA_PARTITION_UA) to store a specific message if .STORED is needed.
- partition: Int32, the partition id.
Next, Perfect Kafka provides the poll() function, which waits a short while and listens for activity on a specific partition: func poll(_ timeout: UInt = 10, partition: Int32 = RD_KAFKA_PARTITION_UA). The timeout is the number of milliseconds to wait while polling.
Finally, call stop() to end the messaging.
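Choosing the from position is mostly an application decision. As one possible policy, the sketch below resumes just after the last offset the application has handled and falls back to the beginning otherwise. The Position enum here is a local stand-in that only mirrors the case names mentioned above, and resumePosition is a hypothetical helper; neither is the library's own definition.

```swift
// Local stand-in mirroring the position cases named above; not the library's type.
enum Position: Equatable {
  case BEGIN, END, STORED
  case SPECIFY(Int64)
}

// Hypothetical policy: resume just after the last offset this process handled,
// or read from the beginning when nothing has been handled yet.
func resumePosition(lastHandledOffset: Int64?) -> Position {
  guard let last = lastHandledOffset else { return .BEGIN }
  return .SPECIFY(last + 1)
}

print(resumePosition(lastHandledOffset: nil)) // BEGIN
print(resumePosition(lastHandledOffset: 41))  // SPECIFY(42)
```

The offset passed in would typically come from the offset member of the last Message seen in OnArrival.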