Using Apache Kafka with Ruby/Rails


Apache Kafka

What is Apache Kafka?

https://kafka.apache.org/

Let’s check official Kafka desсription:

Apache Kafka is a distributed streaming platform.

Again. Kafka is not a message bus. It is a distributed streaming platform.

How it works?

scheme

What Kafka can do/ can’t do:

1) Kafka can send data with just only one pattern: producer - consumer

2) Kafka store messages after delivery (7 days by default)

3) Performance - about 100k in second (not very relevant but 20k in RabbitMQ for example)

4) Clustering

5) Kafka can save messages to disk

6) Availability - partitions/replication

7) Exactly-once delivery

twitter

Usage

As I said before - Kafka is a distributed system that runs in a cluster. Each node in the cluster is called a Kafka broker. Broker is a single Kafka process that operates in a cluster.

For example you want to track users activity in your app. Searches, clicks, page views etc - you can arrange them to different topics and then get this info for statistics services for example.

Another popular case - log rotating or event-based apps architecture layer.

Install and run Kafka and Zookeeper

brew cask install homebrew/cask-versions/java8 
brew install kafka
brew services start zookeeper
brew services start kafka

Of course you can use docker images from here for example: https://hub.docker.com/r/wurstmeister/kafka/

Let’s try to test local Kafka installation:

A topic is a category or feed name to which records are published.

Topics in Kafka are always multi-subscriber; that is, a topic can have zero, one, or many consumers that subscribe to the data written to it.

Let’s try to create first test topic:

kafka-topics --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
Created topic "test".

Now we will initialize the Kafka producer console, which will listen to localhost at port 9092 at topic test:

kafka-console-producer --broker-list localhost:9092 --topic test
>send 1 message
>send 2 message
>works!

Now we will initialize the Kafka consumer console, which will listen to bootstrap server localhost at port 9092 at topic test from beginning:

kafka-console-consumer --bootstrap-server localhost:9092 --topic test --from-beginning
send 1 message
send 2 message
works!

Super simple!

Populare ruby/rails solutions to run.

For ruby and rails you can choose from different gems/solutions to run your and communicate with Kafka.

Karafka + Waterdrop

https://github.com/karafka/karafka

https://github.com/karafka/waterdrop

Add gems:

gem 'waterdrop'
gem 'karafka'

Setup your connection:

WaterDrop.setup do |config|
  config.kafka.seed_brokers = %w[kafka://localhost:9092]
end

Send your message:

WaterDrop::SyncProducer.call('event message', topic: 'events-topic-internal')

Ruby kafka

https://github.com/zendesk/ruby-kafka

require 'kafka'
class KafkaClient
  def self.client
    @client ||= Kafka.new(seed_brokers: ENV['CONNECTION_STRING'])
  end
  
  def self.produce(data, topic:)
    client.producer.produce(data, topic: topic)
    client.producer.deliver_messages
end

Call it!

KafkaClient.produce(my_message_data, topic: 'very_useful_messages')

Delifery boy

Another simple way to publish messages to Kafka from Ruby applications.

https://github.com/zendesk/delivery_boy

# app/controllers/jobs_controller.rb
class JobsController < ApplicationController
  def show
    @job = Job.find(params[:id])
    event_for_kafka = {
      job_id: @job.id,
      ip_address: request.remote_ip,
      timestamp: Time.now,
      user_id: current_user
    }
    # DeliveryBoy.deliver(event.to_json, topic: "job-views") - BLOCKS THE SERVER PROCESS
    # Calling `deliver_async` will enqueue the message in an asynchronous
    # Kafka producer that will periodically deliver all pending messages.
    DeliveryBoy.deliver_async(event.to_json, topic: "job-views")
  end
end

Phobos

Simplifying Kafka for ruby apps - https://github.com/phobos/phobos.

Phobos is a micro framework and library for applications dealing with Apache Kafka.

Install Phobos:

gem 'phobos'

Add configuration:

phobos init
    create  config/phobos.yml
    create  phobos_boot.rb

Add Producer:

class JobProducer
  include Phobos::Producer
end

And use it:

JobProducer.producer.publish('job-topic', 'job-message-payload', 'partition and message key')

Add handler (Consumer)

class JobHandler
  include Phobos::Handler

  def consume(payload, metadata)
    # payload  - This is the content of your Kafka message, Phobos does not attempt to
    #            parse this content, it is delivered raw to you
    # metadata - A hash with useful information about this event, it contains: key,
    #            partition, offset, retry_count, topic, group_id, and listener_id
    # your code here
  end
end

Configure your Phobos installation:

config/phobos.yml

listeners:
  - handler: JobHandler
    topic: job-topic
    group_id: job-1

Run Phobos:

phobos start

Rafka

Another insteresting project:

https://github.com/skroutz/rafka

Super useful article and services for systemd implementation on this link:

https://engineering.skroutz.gr/blog/kafka-rails-integration/

Conclusion

Thats all. As you can see, Kafka can be super useful on different part of your apps.

Just try to choose there parts correctly

Stay tuned.