This repository was archived by the owner on May 4, 2019. It is now read-only.

Commit 0fa8b9f

author
Sai Wong
committed
Merge pull request #5 from wework/post/rails-migrate
Post: How to add columns with default values to really large tables
2 parents 12fcc17 + 901af22 commit 0fa8b9f

1 file changed

+157
-0
lines changed

@@ -0,0 +1,157 @@
---
layout: post
title: Rails Migration - How to add columns with default values to really large tables in Postgres + Rails
author: Sai Wong
summary:
image: http://res.cloudinary.com/wework/image/upload/s--GnhXQxhq--/c_scale,q_jpegmini:1,w_1000/v1445269362/engineering/shutterstock_262325693.jpg
categories: data
---

We had a fairly simple task: add a couple of columns to a table in our
Rails app. This is normally a straightforward operation and a boring task at
best, but for us the fun had only just started. The table in question was
fairly large, with lots of reads on it, and in the spirit of no downtime, this
is the adventure we had.

# TL;DR

Jump straight to the [solution](#attempt-3)!

# The Task
- Add two columns to the notifications table
- Both columns have default values
- The table has 2.2 million rows!

# Attempt #1
```ruby
class AddPhoneFlagsToNotifications < ActiveRecord::Migration
  def change
    add_column :notifications, :text_message, :boolean, default: false
    add_column :notifications, :call_phone, :boolean, default: false
  end
end
```

## Problem
- Migration takes hours!
- The notifications table is locked
- Entire application grinds to a halt

## Reason
- Creating a column with a default value causes every existing row to be rewritten in a single statement (this was the behavior of Postgres versions before 11)
- Rewriting 2.2 million rows is slow
- `ALTER TABLE` holds an exclusive lock on the whole table until the rewrite finishes, so every read and write has to wait
45+
## Solution
46+
- Postgres can create null columns extremely fast! Even on a huge table!
47+
- We can split the work to two tasks, creating the columns and populating the default value

# Attempt #2

```ruby
class AddPhoneFlagsToNotifications < ActiveRecord::Migration
  def change
    # Create the columns as nullable with no default, which is fast
    add_column :notifications, :text_message, :boolean
    add_column :notifications, :call_phone, :boolean

    # Set the defaults for newly inserted rows only
    execute <<-SQL
      ALTER TABLE notifications
        ALTER COLUMN text_message SET DEFAULT false,
        ALTER COLUMN call_phone SET DEFAULT false
    SQL

    # Backfill existing rows in batches of ids
    last_id = Notification.last.id
    batch_size = 10000
    (0..last_id).step(batch_size).each do |from_id|
      # BETWEEN is inclusive, so stop one id short of the next batch
      to_id = from_id + batch_size - 1
      execute <<-SQL
        UPDATE notifications
        SET
          text_message = false,
          call_phone = false
        WHERE id BETWEEN #{from_id} AND #{to_id}
      SQL
    end
  end
end
```

## Problem
- Migration takes hours!
- The notifications table is still locked!
- Entire application grinds to a halt

## Reason
- By default, Rails wraps each migration in a transaction on Postgres so that a failed migration can be rolled back
- The column adds AND the row updates are in one gigantic transaction!
- Locks taken inside a transaction are held until it commits
- The exclusive lock taken by `ALTER TABLE` therefore stays in place for the entire backfill, and the whole table is locked again!

## Solution
- You can disable the transaction wrapper around a Rails migration by calling `disable_ddl_transaction!` in your migration class
- But then you have to handle transactions on your own
- We can then run each step in its own transaction
- We add our own error handling to roll back the operation

# Attempt #3

```ruby
class AddPhoneFlagsToNotifications < ActiveRecord::Migration
  # Opt out of the single transaction Rails would wrap around this migration
  disable_ddl_transaction!

  def up
    # Step 1: add the nullable columns and set their defaults in one short transaction
    ActiveRecord::Base.transaction do
      add_column :notifications, :text_message, :boolean, default: nil
      add_column :notifications, :call_phone, :boolean, default: nil

      sql = <<-SQL
        ALTER TABLE notifications
          ALTER COLUMN text_message SET DEFAULT false,
          ALTER COLUMN call_phone SET DEFAULT false
      SQL
      execute(sql)
    end

    # Step 2: backfill existing rows, one small transaction per batch
    last_id = Notification.last.id
    batch_size = 10000
    (0..last_id).step(batch_size).each do |from_id|
      # BETWEEN is inclusive, so stop one id short of the next batch
      to_id = from_id + batch_size - 1
      ActiveRecord::Base.transaction do
        execute <<-SQL
          UPDATE notifications
          SET
            text_message = false,
            call_phone = false
          WHERE id BETWEEN #{from_id} AND #{to_id}
        SQL
      end
    end
  rescue => e
    # roll back our work
    down
    raise e
  end

  # The original listing calls `down` without showing it; this is an assumed
  # implementation that drops whichever of the two columns exist.
  def down
    remove_column :notifications, :text_message if column_exists?(:notifications, :text_message)
    remove_column :notifications, :call_phone if column_exists?(:notifications, :call_phone)
  end
end
```
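
Because `BETWEEN` is inclusive on both ends, each batch should end one id before the next one begins, and the last batch can stop at the highest id. Here is a minimal plain-Ruby sketch of that arithmetic. The `batch_ranges` helper is hypothetical (it is not part of Rails or of our migration) and it assumes ids start at 1:

```ruby
# Hypothetical helper: split the ids 1..last_id into inclusive,
# non-overlapping ranges of at most batch_size ids each.
def batch_ranges(last_id, batch_size)
  # An empty table has no last id, so there is nothing to backfill
  return [] if last_id.nil? || last_id < 1

  (1..last_id).step(batch_size).map do |from_id|
    # Clamp the final batch so it never runs past the last id
    to_id = [from_id + batch_size - 1, last_id].min
    from_id..to_id
  end
end

ranges = batch_ranges(25, 10)
# ranges is [1..10, 11..20, 21..25]
```

With a last id of 2,200,000 and a batch size of 10,000, this yields 220 batches, each of which would map to one `UPDATE ... WHERE id BETWEEN first AND last` statement.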

## Result
- Migration still takes hours!
- The table is never locked for long
- Application is slower due to all the writes to the notifications table
- Nothing grinds to a halt

# Takeaways
- Always be mindful of the number of rows affected in the migration
- Be mindful of the transaction size
- Leverage Postgres features

## Possible alternate solutions
- Handle the NULL case in code and treat it as the desired default value
  - A clean solution with a quick turnaround, but it required us to muck up the model to abstract out that case. Given that we may or may not have complete control over how those values are extracted from the model, this may turn into lots of defensive code.
- Add a view in the database to do the mapping for us
  - A very clean solution, though it would require us to maintain both the schema and the view whenever we change the schema of that table. Though we don't change the schema of this table often, the extra maintenance overhead was deemed not worth the value.
- Add a trigger to only update rows that are actively queried
  - Also a very clean solution, though it came down to data integrity: since our data eventually gets slurped up by our data team, having a sane state in our data was the highest priority. This meant that a NULL state on a Boolean was not desired. Ultimately, we could have added the trigger to handle any current requests and just made the migration run slowly to backfill less frequently accessed rows. Since we were able to run the entire migration within a night, we decided it wasn't worth the additional hassle.
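
To make the first alternative concrete, here is a rough plain-Ruby sketch of treating NULL as the default in code. `NotificationFlags` is a hypothetical stand-in for the model, not our actual `Notification` class; a real ActiveRecord model would read these values from the table columns instead of instance variables.

```ruby
# Hypothetical stand-in for the model, used only to illustrate the idea.
class NotificationFlags
  def initialize(text_message: nil, call_phone: nil)
    @text_message = text_message
    @call_phone = call_phone
  end

  # Treat NULL (nil) as the intended default of false
  def text_message
    @text_message.nil? ? false : @text_message
  end

  def call_phone
    @call_phone.nil? ? false : @call_phone
  end
end

legacy = NotificationFlags.new                       # row that was never backfilled
opted_in = NotificationFlags.new(text_message: true) # row with an explicit value
```

Anything that bypasses these readers, such as raw SQL or a data export, still sees NULL, which is where the defensive code starts to pile up.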
