Hey there.
Today we’ll discuss migrations and related problems on a large tables.
Let’s start from easy thing - just add column for a table with 10 million rows.
Task 1:
Add two columns to the products table with default values.
class AddFieldsToProducts < ActiveRecord::Migration
def change
add_column :products, :shipped, :boolean, default: false
add_column :products, :cancelled, :boolean, default: false
end
end
In this case your migration takes hours, and the products table will be locked.
Solution 1:
What you should do? Postgres can create null columns super fast even on a very big tables.
We can split our migration to the two tasks:
- Creating the columns
- Populating the default value
You can do the last step in background for example with Sidekiq
or any other background job, or for example do a batch updates.
Cool. Let’s move forward.
Task 2:
Let’s change column type in a huge table.
Solution 2:
Instead, add a new column with a new name/type - you can add new column, update your code to use new one, and when you’ve completed your upgrades, you can DROP the column. Another integersting thing - with ActiveRecord you cou can hide the old column from your code like this:
class Product < ActiveRecord::Base
# TODO: You should remove this after completing migration
def self.columns
super.reject { |c| c.name == 'status' }
end
end
Task 3:
Removing a column
Solution 3:
In this case - instead of dropping column, your first step should be telling ActiveRecord to stop writing to this column and modify your code to not use it.
Let’s take a look into an example:
class Products < ActiveRecord::Base
def self.columns
super.reject { |column| column.name == "status" }
end
end
After deploying modified code, you can simple remove column as usual:
class RemoveStatusFromUsers < ActiveRecord::Migration
def change
remove_column :users, :status, :string
end
end
Task 4:
Renaming a column
Solution 4:
To rename a column in Rails and ActiveRecord we need to do 3 deployments.
As an example, let’s imagine we have a status
column on our products table which we would like to rename to current_status
.
You can start from the adding new column and tell ActiveRecord to write to both the old and the new columns, but keep reading from the old column.
Let’s add a column first:
class AddCurrentStatusToProducts < ActiveRecord::Migration
def change
add_column :products, :current_status, :datetime
end
end
Let’s tell AR to set and read from new column:
class Product < ActiveRecord::Base
def current_status=(value)
self[:status] = value
self[:current_status] = value
end
def current_status
self[:current_status]
end
end
Ok, let’s deploy your changes, and wait a little to sync all columns (btw you can use some background worker to populate data). After deploying you can use your new column.
class SyncColumnsInProducts < ActiveRecord::Migration
def up
execute "UPDATE products SET current_status = status;"
end
def down
end
end
Let’s use new column name:
class Product < ActiveRecord::Base
def self.columns
super.reject { |column| column.name == "status" }
end
end
After deploing this change, you can simple remove column:
class RemoveStatusFromProducts < ActiveRecord::Migration
def change
remove_column :products, :status
end
end
Bonus tip
If some parts of your application use heavy and slow queries, good choice here to use read-only replica of your main DB.
You can just setup it and route all your necessary queries into it. It can help you to avoid additional timeouts and perfomance problems on main DB.