Asserting for DB failovers #35

Closed as not planned
@jacobbednarz

Description

I'm using Toxiproxy for our external services and we're now getting ready to do a bunch of DB failover work. To handle failovers without dropping queries, we've patched ActiveRecord to catch any MySQL errors, reconnect, and then retry the query. I can confirm by hand that this works by kicking off this script and toggling the availability of either the Toxiproxy proxy or the DB server while it runs.

require 'securerandom'

ATTEMPT_COUNT = 300

puts "==> Truncating the test_db.user table"
ActiveRecord::Base.connection.execute("truncate test_db.user")

puts "==> Starting to send MySQL queries"
ATTEMPT_COUNT.times do
  sleep 0.1
  hash = SecureRandom.uuid
  begin
    puts "    [#{Time.now.strftime("%T.%L")}] Inserting #{hash}"
    ActiveRecord::Base.connection.execute("INSERT INTO user (first_name, last_name) VALUES ('test', '#{hash}')")
    puts "    Success!"
  rescue StandardError => e # StandardError rather than Exception, so Ctrl-C still stops the script
    puts "    [#{Time.now.strftime("%T.%L")}] #{e}"
  end
end

row_count = ActiveRecord::Base.connection.execute("select * from user").count
puts
puts "Attempted writes: #{ATTEMPT_COUNT}"
puts "DB row count:     #{row_count}"
puts "Variance:         #{ATTEMPT_COUNT - row_count}"
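
For illustration only, the catch, reconnect, retry behaviour described above could be sketched roughly as follows. This is not the actual ActiveRecord patch: `with_reconnect`, its `max_attempts` and `wait` parameters, and the `reconnect` callable are all hypothetical names, and a real patch would rescue the specific MySQL error classes rather than `StandardError`.

```ruby
# Hypothetical sketch of a catch -> reconnect -> retry wrapper.
# `reconnect` is any callable that re-establishes the connection.
def with_reconnect(max_attempts: 5, wait: 0.5, reconnect:)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue StandardError
    # Give up and re-raise once the attempt budget is exhausted.
    raise if attempts >= max_attempts
    sleep wait
    reconnect.call
    retry # re-runs the begin block, i.e. the original query
  end
end

# Possible usage inside a Rails app (assumption, not run here):
#
#   with_reconnect(reconnect: -> { ActiveRecord::Base.connection.reconnect! }) do
#     User.first
#   end
```

A bounded `max_attempts` is used here so a test cannot hang forever if the server never comes back; a patch that retries indefinitely would block in exactly the way described below.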

However, I'm getting a little stuck when it comes to using Toxiproxy to emulate the failover completing. I first tried:

Toxiproxy[:mysql_master].down do
  User.first
end

It seems our patch works a little too well: it sits here waiting for the MySQL server to come back, but it never does, because the proxy is only re-enabled once the block returns and the block is still running the query. I then tried splitting the disable/enable calls, but got the same result with the following:

Toxiproxy[:mysql_master].disable
User.first
Toxiproxy[:mysql_master].enable

Which leads me to the following questions:

  • Could you share how you're using Toxiproxy with things like DB failovers? Is this something you're able to test similarly to my intention, or do you handle it on a per-model basis? I essentially need the proxy to be down for only a short period of time and then re-enabled once that time has passed.
  • The only way I could think of having this work would be to pass another argument to down (and later disable) which would only disable the proxy for a period of time. Is applying a non-blocking timeout to that functionality something you'd consider useful for the library?
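
A rough sketch of what such a timed, non-blocking disable might look like, built only on the `disable` and `enable` calls shown above. `down_for` is a hypothetical helper name, not something toxiproxy-ruby provides, and the usage at the bottom assumes the retry patch keeps the query blocked until the proxy returns.

```ruby
# Hypothetical helper, not part of toxiproxy-ruby: disables `proxy`, schedules
# a background thread to re-enable it after `seconds`, and runs the block
# while the proxy is down. `proxy` is anything responding to #disable/#enable.
def down_for(proxy, seconds)
  proxy.disable
  restorer = Thread.new do
    sleep seconds
    proxy.enable
  end
  yield
ensure
  # Wait for the proxy to be re-enabled before returning, even if the block raised.
  restorer&.join
end

# Intended usage (assumes the toxiproxy gem and a configured :mysql_master proxy):
#
#   down_for(Toxiproxy[:mysql_master], 2) do
#     User.first # expected to block in the retry loop until the proxy is back
#   end
```

Re-enabling from a separate thread is what avoids the deadlock: the main thread can stay stuck in the retry loop while the proxy comes back on its own schedule.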

Thanks!
