Marat's Notes

Apache Kafka

What is Apache Kafka?

https://kafka.apache.org/

Let’s check official Kafka desсription:

Apache Kafka is a distributed streaming platform.

Again. Kafka is not a message bus. It is a distributed streaming platform.

How it works?

scheme

What Kafka can do/ can’t do:

1) Kafka can send data with just only one pattern: producer - consumer

2) Kafka store messages after delivery (7 days by default)

3) Performance - about 100k in second (not very relevant but 20k in RabbitMQ for example)

4) Clustering

5) Kafka can save messages to disk

6) Availability - partitions/replication

7) Exactly-once delivery

twitter

Usage

As I said before - Kafka is a distributed system that runs in a cluster. Each node in the cluster is called a Kafka broker. Broker is a single Kafka process that operates in a cluster.

For example you want to track users activity in your app. Searches, clicks, page views etc - you can arrange them to different topics and then get this info for statistics services for example.

Another popular case - log rotating or event-based apps architecture layer.

Install and run Kafka and Zookeeper

brew cask install homebrew/cask-versions/java8 
brew install kafka
brew services start zookeeper
brew services start kafka

Of course you can use docker images from here for example: https://hub.docker.com/r/wurstmeister/kafka/

Let’s try to test local Kafka installation:

A topic is a category or feed name to which records are published.

Topics in Kafka are always multi-subscriber; that is, a topic can have zero, one, or many consumers that subscribe to the data written to it.

Let’s try to create first test topic:

kafka-topics --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
Created topic "test".

Now we will initialize the Kafka producer console, which will listen to localhost at port 9092 at topic test:

kafka-console-producer --broker-list localhost:9092 --topic test
>send 1 message
>send 2 message
>works!

Now we will initialize the Kafka consumer console, which will listen to bootstrap server localhost at port 9092 at topic test from beginning:

kafka-console-consumer --bootstrap-server localhost:9092 --topic test --from-beginning
send 1 message
send 2 message
works!

Super simple!

Populare ruby/rails solutions to run.

For ruby and rails you can choose from different gems/solutions to run your and communicate with Kafka.

Karafka + Waterdrop

https://github.com/karafka/karafka

https://github.com/karafka/waterdrop

Add gems:

gem 'waterdrop'
gem 'karafka'

Setup your connection:

WaterDrop.setup do |config|
  config.kafka.seed_brokers = %w[kafka://localhost:9092]
end

Send your message:

WaterDrop::SyncProducer.call('event message', topic: 'events-topic-internal')

Ruby kafka

https://github.com/zendesk/ruby-kafka

require 'kafka'
class KafkaClient
  def self.client
    @client ||= Kafka.new(seed_brokers: ENV['CONNECTION_STRING'])
  end
  
  def self.produce(data, topic:)
    client.producer.produce(data, topic: topic)
    client.producer.deliver_messages
end

Call it!

KafkaClient.produce(my_message_data, topic: 'very_useful_messages')

Delifery boy

Another simple way to publish messages to Kafka from Ruby applications.

https://github.com/zendesk/delivery_boy

# app/controllers/jobs_controller.rb
class JobsController < ApplicationController
  def show
    @job = Job.find(params[:id])
    event_for_kafka = {
      job_id: @job.id,
      ip_address: request.remote_ip,
      timestamp: Time.now,
      user_id: current_user
    }
    # DeliveryBoy.deliver(event.to_json, topic: "job-views") - BLOCKS THE SERVER PROCESS
    # Calling `deliver_async` will enqueue the message in an asynchronous
    # Kafka producer that will periodically deliver all pending messages.
    DeliveryBoy.deliver_async(event.to_json, topic: "job-views")
  end
end

Phobos

Simplifying Kafka for ruby apps - https://github.com/phobos/phobos.

Phobos is a micro framework and library for applications dealing with Apache Kafka.

Install Phobos:

gem 'phobos'

Add configuration:

phobos init
    create  config/phobos.yml
    create  phobos_boot.rb

Add Producer:

class JobProducer
  include Phobos::Producer
end

And use it:

JobProducer.producer.publish('job-topic', 'job-message-payload', 'partition and message key')

Add handler (Consumer)

class JobHandler
  include Phobos::Handler

  def consume(payload, metadata)
    # payload  - This is the content of your Kafka message, Phobos does not attempt to
    #            parse this content, it is delivered raw to you
    # metadata - A hash with useful information about this event, it contains: key,
    #            partition, offset, retry_count, topic, group_id, and listener_id
    # your code here
  end
end

Configure your Phobos installation:

config/phobos.yml

listeners:
  - handler: JobHandler
    topic: job-topic
    group_id: job-1

Run Phobos:

phobos start

Rafka

Another insteresting project:

https://github.com/skroutz/rafka

Super useful article and services for systemd implementation on this link:

https://engineering.skroutz.gr/blog/kafka-rails-integration/

Conclusion

Thats all. As you can see, Kafka can be super useful on different part of your apps.

Just try to choose there parts correctly

Stay tuned.

Using Apache Kafka with Ruby/Rails