I got some good answers to my question about High Availability, and it sounds like this is not on the horizon in the near future.
The next best thing to HA for me would be providing an expectation about recovery time if our Drone instance totally kicked the bucket. I’ve already got Chef scripts to automate the process of standing up a fresh Drone instance. But I’m not clear whether there’s a notion of backup & restore for Drone, either built-in or something I could hand-roll.
Drone stores all state in the database, so you would set up regular database backups from which you could restore. You would need to consult your database vendor's documentation for backup and restore options and tools.
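
As one illustration, if your Drone instance happens to be backed by SQLite, a hand-rolled backup could look roughly like the sketch below. It uses the online backup API exposed by Python's standard `sqlite3` module, which copies a consistent snapshot even while the database is in use. The function names and any file paths you pass in are hypothetical, not something Drone provides; for Postgres or MySQL you would use the vendor's own tools (for example `pg_dump` or `mysqldump`) instead.

```python
import sqlite3
from pathlib import Path


def backup_sqlite(source: Path, dest: Path) -> None:
    """Copy a SQLite database to dest using SQLite's online backup API.

    Assumption: Drone is configured to use a SQLite file at `source`.
    """
    if not source.is_file():
        raise FileNotFoundError(f"no database file at {source}")
    dest.parent.mkdir(parents=True, exist_ok=True)

    src = sqlite3.connect(source)
    dst = sqlite3.connect(dest)
    try:
        # Copies every page of the source database into the destination,
        # replacing whatever the destination previously contained.
        src.backup(dst)
    finally:
        dst.close()
        src.close()


def restore_sqlite(backup: Path, target: Path) -> None:
    """Restore a backup over the target database file.

    Stop the Drone server first so nothing writes to the target mid-restore.
    """
    backup_sqlite(backup, target)
```

You could run `backup_sqlite` from cron (or a Chef-managed scheduled job) and ship the resulting file off the host. Your recovery time is then roughly the time to stand up a fresh instance plus the time to copy the latest backup into place, and the data you stand to lose is whatever changed since the last backup ran.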