Migration
See issue: hkn-rails issue #186
When the OCF upgraded its machines from Debian 8 (jessie) to 9 (stretch), it had a transition period for users on the old apphost (werewolves.ocf.berkeley.edu) to migrate their apps to the new apphost (vampires.ocf.berkeley.edu).
The idea was that we would get hkn-rails running simultaneously on both werewolves (jessie) and vampires (stretch), so when the OCF re-routed web traffic from werewolves to vampires, there would be no downtime.
When @jvperrin and I (@jameslzhu) migrated hkn-rails, we created a separate Capistrano target, migrate, which deploys to the new apphost vampires in a separate deploy folder, ~/hkn-rails/migrate/. (The previous deploy folder was ~/hkn-rails/prod.)
We implemented several workarounds in response to various issues arising from our specific setup on the OCF:
- NFS (Network File System) sharing between `werewolves` and `vampires`, causing both to share the same files
- (Not really related, but useful) Unix socket file binding, where traffic to `hkn.eecs.berkeley.edu` is routed to the program bound to the socket file `/srv/apps/hkn/hkn.sock` (see the apphosting docs).
- Service starting / restarting management with `systemd`, whose unit files are also shared between the two hosts because of NFS
- RVM (Ruby Version Manager), which installs and compiles Ruby on the apphost in our user directory (`~hkn`)
- Our use of Solr, a Java indexing engine which runs as a separate subprocess from hkn-rails. We write its PID to a file, which hkn-rails uses to know that Solr is running and which PID to connect to.
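The PID-file convention in the last item can be checked from a shell. The following is a minimal sketch, not the check hkn-rails itself performs: the function name `pid_file_alive` is hypothetical, and the PID file path is passed in as an argument because this page does not record where the Solr PID file lives.

```shell
#!/bin/sh
# Sketch: decide whether the process recorded in a PID file is still alive.
# ASSUMPTION: pid_file_alive is an illustrative helper, not part of hkn-rails.

pid_file_alive() {
    pid_file=$1
    # A missing or empty PID file means nothing is recorded as running.
    [ -s "$pid_file" ] || return 1
    pid=$(cat "$pid_file")
    # Reject anything that is not all digits, and the literal 0
    # (kill treats PID 0 as "the whole process group").
    case $pid in
        ''|*[!0-9]*|0) return 1 ;;
    esac
    # Signal 0 delivers nothing; it only tests that the PID exists
    # and that we are allowed to signal it.
    kill -0 "$pid" 2>/dev/null
}
```

A call such as `pid_file_alive /path/to/solr.pid && echo "Solr is running"` would report liveness. A PID is only meaningful on the host that wrote it, so with an NFS-shared PID file the check should run on the same host as Solr.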
NFS, by itself, caused several issues:
- Incompatible Ruby binaries
  - The same Ruby binaries were present on `werewolves` and `vampires`. Because Debian stretch upgraded various system libraries, the Ruby compiled on `werewolves` (2.5.0) linked to shared libraries that were not present on `vampires`.
  - Solution: we created a Git branch 'migrate', in which we edited the `Gemfile` ruby version from `ruby: '2.5.0'` to `ruby: '2.5.1'`. We installed Ruby 2.5.1 on `vampires` with rvm, and added `rvm` version config in the `Capfile` to denote which version `capistrano` should use when deploying.
- Systemd unit file changes
  - The systemd unit file, which specifies the hkn-rails script to run at startup, runs only when the host is `werewolves`: `ConditionHost=werewolves`
  - Solution: in the `migrate` branch, the systemd unit file has the host changed to `vampires`. On the apphost, the service file (in ~/.config/systemd/user) has been renamed to `hkn-rails-migrate.service` (to avoid NFS collision with `hkn-rails.service`). `hkn-rails.service` was enabled on `werewolves`, and `hkn-rails-migrate.service` was enabled on `vampires`.
- Solr detection failure
  - The workaround is not recorded here. @jvperrin, do you know how we got around this?
- Shared folder inconsistency
  - The deploy uses `~/hkn-rails/prod/shared` to share files between releases, e.g. resumes, pid files, configuration. We don't want to lose access to this in the new deploy.
  - Solution: symlink the new shared folder to the old: `~/hkn-rails/migrate/shared -> ~/hkn-rails/prod/shared`.
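The shared-folder symlink from the last item above can be sketched as a small shell function. This is an illustration under assumptions, not the exact commands that were run: the name `link_shared` is hypothetical, and both deploy roots are passed as arguments (on the apphost they would correspond to `~/hkn-rails/prod` and `~/hkn-rails/migrate`).

```shell
#!/bin/sh
# Sketch: make <new_root>/shared a symlink to <old_root>/shared.
# ASSUMPTION: link_shared is an illustrative helper. Pass absolute paths,
# since a relative link target is resolved relative to the link's own directory.

link_shared() {
    old_root=$1
    new_root=$2
    # Refuse to link to a shared folder that does not exist.
    if [ ! -d "$old_root/shared" ]; then
        echo "missing $old_root/shared" >&2
        return 1
    fi
    # Refuse to clobber a real directory that may already hold files.
    if [ -d "$new_root/shared" ] && [ ! -L "$new_root/shared" ]; then
        echo "$new_root/shared is a real directory; move it aside first" >&2
        return 1
    fi
    mkdir -p "$new_root"
    # -s: symbolic link, -f: replace an existing link,
    # -n: treat an existing link to a directory as the link itself
    ln -sfn "$old_root/shared" "$new_root/shared"
}
```

For example, `link_shared "$HOME/hkn-rails/prod" "$HOME/hkn-rails/migrate"` would recreate the link described above. The `-n` flag keeps a repeated call from creating a nested `shared/shared` link inside the old folder.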
Production deployment today involves checking out the migrate git branch, then deploying to the migrate target with:

    bundle exec cap migrate deploy

We would like to return to checking out the master git branch, and deploying to prod; this reduces confusion for new contributors, and reduces redundancy in our config. This will require merging all of the changes on migrate into master, as well as updating the server-side configuration through ssh:
- systemd unit renamings (`hkn-rails-migrate` -> `hkn-rails`)
- Double-checking `shared/` folder consistency
- Making sure Solr connections still work
- Avoiding downtime (some will be required, to avoid simultaneous bindings to the socket file)
- Check shared folder will be consistent
- Edit deploy.rb to restart hkn-rails.service, instead of hkn-rails-migrate.service
- Edit logrotate systemd files on apphost to restart hkn-rails.service
- `systemctl --user daemon-reload`
- Merge into master
- Stop hkn-rails-migrate.service
- Delete old 2.5.0 bundler gems (`~/hkn-rails/prod/shared/bundle`)
- Deploy prod with Capistrano
- Start hkn-rails.service
- Check if working
- Disable hkn-rails-migrate.service
- Enable hkn-rails.service
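The command-line portion of the checklist above might look roughly like the following sketch. It is a dry run by default, so it prints the commands instead of executing them. Several things here are assumptions: the helper names `run` and `cutover` are hypothetical, the Capistrano stage name `prod` is inferred by analogy with `bundle exec cap migrate deploy`, and the sketch ignores that the Capistrano step probably runs from a development machine while the `systemctl --user` steps run on the apphost over ssh. The manual steps (editing deploy.rb and the logrotate units, merging into master, checking that the site works) are left out.

```shell
#!/bin/sh
# Sketch of the cutover order from the checklist.
# DRY_RUN defaults to 1, so commands are printed rather than executed.

DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

cutover() {
    # Pick up the renamed or edited unit files.
    run systemctl --user daemon-reload || return 1
    # Stop the old unit first so two processes never bind the socket file.
    run systemctl --user stop hkn-rails-migrate.service || return 1
    # Gems built against Ruby 2.5.0; path as given in the checklist.
    run rm -rf "$HOME/hkn-rails/prod/shared/bundle" || return 1
    # ASSUMPTION: the Capistrano stage is named "prod".
    run bundle exec cap prod deploy || return 1
    run systemctl --user start hkn-rails.service || return 1
    # Verify the site by hand at this point, before switching which
    # unit starts at boot.
    run systemctl --user disable hkn-rails-migrate.service || return 1
    run systemctl --user enable hkn-rails.service
}
```

Calling `cutover` with the default `DRY_RUN=1` prints the seven commands in order, which is a cheap way to review the plan before running anything for real.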