ElasticSearch with ActiveRecord

ElasticSearch (ES) is a powerful full-text search engine based on Apache Lucene.

A key characteristic of ElasticSearch is that it’s distributed at its core, meaning that you can scale it horizontally for redundancy or performance.

ElasticSearch can also be used as a data store, but it has some disadvantages in that role:

1) Security – At the time of writing, ElasticSearch does not provide any built-in security or access control system out of the box. The only way to protect ES from external access is with a firewall.

2) Computation – There is limited support for advanced computation on the database side.

3) Data Availability – Data in ElasticSearch is available in “near real time”, meaning that if you submit a comment to a post and refresh the page right away, the comment might not show up because the index is still being updated.

4) Durability – ES is distributed and relatively stable, but backups are not as high a priority as in other data store solutions. This is an important consideration when ElasticSearch is your primary data store.

To start using ElasticSearch in a Rails application, we first need to choose a gem.

You can choose between https://github.com/toptal/chewy from the Toptal team and https://github.com/elastic/elasticsearch-ruby from Elastic.

In this article I’ll show how to use elasticsearch-ruby in our project prototype.

We start by adding the gems to the Gemfile:

gem 'elasticsearch-model'
gem 'elasticsearch-rails'

Then we add two lines of code to our Job model:

require 'elasticsearch/model'

class Job < ApplicationRecord


  validates :title, presence: true
  validates :description, presence: true
  validates :email, presence: true, format: {
    with: /\A([^@\s]+)@((?:[-a-z0-9]+\.)+[a-z]{2,})\z/i,
    unless: proc { |vacancy| vacancy.email.blank? }
  }
  validates :currency, inclusion: {
    # in: [...] (the list of allowed currencies is omitted in the original listing)
    allow_blank: true
  }

  scope :approved, -> { where.not(approved_at: nil) }
  scope :not_approved, -> { where(approved_at: nil) }
  scope :descent_order, -> { order('id DESC') }
  has_one_attached :company_logo

  # These two lines are all that is needed for the ElasticSearch integration
  include Elasticsearch::Model
  include Elasticsearch::Model::Callbacks
end
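
By default every attribute of the record ends up in the index (the search result further down includes the email, for example). If that is more than you want, elasticsearch-model lets the model define its own as_indexed_json method. Here is a minimal sketch of that idea, not code from the project: the JobSearchDocument module, the INDEXED_FIELDS list and the FakeJob stand-in are all hypothetical names used only for illustration.

```ruby
# Hypothetical module (not part of the gem): limits which attributes are
# sent to ElasticSearch when a Job is indexed.
module JobSearchDocument
  # Assumed field list; adjust it to the columns you want searchable.
  INDEXED_FIELDS = %w[title description company].freeze

  # elasticsearch-model calls as_indexed_json on the record to build the
  # indexed document; here only the listed attributes are returned.
  def as_indexed_json(_options = {})
    attributes.slice(*INDEXED_FIELDS)
  end
end

# Stand-in object exposing an `attributes` hash, so the sketch runs
# without a database or a running ElasticSearch cluster.
FakeJob = Struct.new(:attributes) { include JobSearchDocument }

job = FakeJob.new({ 'title' => 'Rails developer', 'description' => 'New Jobs',
                    'company' => 'Google', 'email' => 'test@google.com' })
p job.as_indexed_json
```

In the real model you would include JobSearchDocument next to Elasticsearch::Model and rebuild the index afterwards.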

Let’s populate our search index:

irb(main):019:0> Job.import(force: true)

Great, let’s take a look at our seed data.

irb(main):018:0> Job.all
  Job Load (1.0ms)  SELECT  "jobs".* FROM "jobs" LIMIT $1  [["LIMIT", 11]]
=> #<ActiveRecord::Relation [
  #<Job id: 1, title: "Rails developer", description: "New Jobs", email: "test@google.com", company: "Google", website: "http://google.com", salary_min: 100, salary_max: 1000, currency: "", status: nil, approved: nil, expire_at: nil, approved_at: nil, created_at: "2018-02-03 17:55:21", updated_at: "2018-02-03 22:54:29">, 
  #<Job id: 2, title: "Go developer", description: "Go dev", email: "gojob@mail.com", company: "Yandex", website: "http://go.com", salary_min: 1000, salary_max: 3000, currency: "", status: nil, approved: nil, expire_at: nil, approved_at: nil, created_at: "2018-02-04 16:52:09", updated_at: "2018-02-04 16:56:41">, 
  #<Job id: 3, title: "Java engineer", description: "Java engineer", email: "java_job@mail.com", company: "Yandex", website: "http://gocompany.ru", salary_min: 500, salary_max: 1000, currency: "", status: nil, approved: nil, expire_at: nil, approved_at: nil, created_at: "2018-02-04 16:52:48", updated_at: "2018-02-04 17:33:41">]>

Let’s call the search method on our Job model with a search string:

irb(main):026:0> Job.search("Rails").results.size
=> 1

irb(main):025:0> Job.search("Rails").results.first      
=> #<Elasticsearch::Model::Response::Result:0x00007fbdb33e45e0 @result=#<Elasticsearch::Model::HashWrapper _id="1" _index="jobs" _score=0.2876821 _source=#<Elasticsearch::Model::HashWrapper approved=nil approved_at=nil company="Google" created_at="2018-02-03T17:55:21.627Z" currency="" description="New Jobs" email="test@google.com" expire_at=nil id=1 salary_max=1000 salary_min=100 status=nil title="Rails developer" updated_at="2018-02-03T22:54:29.857Z" website="http://google.com"> _type="job">>
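
The search method is not limited to a plain string: it also accepts a hash written in the ElasticSearch query DSL. The sketch below builds such a hash for a multi_match query. The job_query helper, the chosen fields and the ^2 boost on title are assumptions made for illustration, not something taken from the prototype.

```ruby
# Hypothetical helper: builds a multi_match query body for the jobs index.
# The field names and the ^2 boost on title are assumptions.
def job_query(term, fields: %w[title^2 description company])
  {
    query: {
      multi_match: {
        query: term,    # the text the user typed
        fields: fields  # fields to search; "title^2" weights title matches higher
      }
    }
  }
end

# With the Job model from this article the hash would be passed to search:
#   Job.search(job_query("Rails")).results
p job_query("Rails")
```

Keeping the query in a small plain-Ruby method like this makes it easy to check without a running cluster.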

As you can see, a basic implementation is pretty easy. In the next posts I’ll write about the Chewy gem and about using ElasticSearch as a data storage layer.

Stay tuned.