Preserving HB's content for future reference

#1: Post by **jbviau** » September 17th, 2020, 9:44 am

Dan, what happened over on coffeegeek.com (CG) this week made me wonder about our archives here. What measures are in place to make sure HB's content is preserved for future reference? Is the site backed up regularly? Don't mean to pry; I could just use some reassurance!

In brief, CG permanently lost 9 months' worth of content due to server issues: https://www.coffeegeek.com/forums/membe ... ews/741207

Such a massive failure here on HB, where the forums are much more active, would leave a larger hole, so to speak.

This sort of thing reminds me of when alt.coffee disappeared (link) and also the abrupt closure of the singleservecoffee forums a few years ago. The latter isn't relevant to my life anymore, but I'd made thousands of posts there back in the day that went <poof>.

#2: Post by **HB** » September 17th, 2020, 10:58 am

Yeah, I read about that. Ugh.

I've given "what ifs" a lot of thought, and not just as part of running this site. Performance optimization, high availability, and disaster recovery are topics that I deal with in my day job as a hybrid cloud tech evangelist. I've used this site as a way of building skills by putting these principles into practice. For example, I have an automated process that runs each morning that rebuilds this site from backups; I use that build process for my test server, so I'm certain it's correct (and if it fails, I'll notice right away).
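For readers who want to see what "rebuild from backups and notice if it fails" can look like, here is a minimal sketch. It is not HB's actual process: SQLite stands in for the forum database, and the `posts` table, the `min_posts` threshold, and the function names are all assumptions made up for illustration.

```python
import sqlite3
import tempfile
from pathlib import Path


def make_backup(live_path: Path, backup_path: Path) -> None:
    """Copy the live database to a backup file using SQLite's online backup API."""
    src = sqlite3.connect(live_path)
    dst = sqlite3.connect(backup_path)
    try:
        src.backup(dst)
    finally:
        src.close()
        dst.close()


def verify_restore(backup_path: Path, min_posts: int) -> bool:
    """Restore the backup into a scratch database and run basic sanity checks.

    Returns False if the backup cannot be read, fails SQLite's integrity
    check, or holds fewer rows in the (hypothetical) posts table than expected.
    """
    with tempfile.TemporaryDirectory() as scratch:
        restored = Path(scratch) / "restored.db"
        src = sqlite3.connect(backup_path)
        dst = sqlite3.connect(restored)
        try:
            src.backup(dst)  # the "rebuild" step: restore into a fresh file
            intact = dst.execute("PRAGMA integrity_check").fetchone()[0] == "ok"
            count = dst.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
        except sqlite3.DatabaseError:
            return False  # unreadable backup or missing table
        finally:
            src.close()
            dst.close()
        return intact and count >= min_posts
```

Run daily from a scheduler, a `False` result (or an exception) is the "I'll notice right away" signal. A real restore test would also bring up the application against the restored data, which this sketch leaves out.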

Unfortunately, as Mark on CoffeeGeek learned the hard way, backups are potentially worthless if they're not tested. Years ago, I learned that same lesson on a smaller scale, which is why this site is mirrored across multiple servers. As an added benefit, that allows me to test new code on a "live" site and if there's an unanticipated problem, nobody notices. Even if the change goes to full production, backing out a gaffe only takes a few minutes. I cannot count how many times that's happened! The ability to revert to a known working point saves me from freaking out.

For those who are curious, this is a blue-green deployment strategy that uses a content delivery network (e.g., Cloudflare) to make these behind-the-scenes machinations transparent.
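The core idea of blue-green deployment can be sketched in a few lines. This is an illustration only, not the site's real setup: in practice the switch happens at the CDN or proxy, whereas here it is just an in-memory pointer, and `BlueGreenRouter`, `promote`, `rollback`, and the `healthy` callback are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class BlueGreenRouter:
    """Tracks which of two identical environments receives live traffic."""

    active: str = "blue"

    @property
    def idle(self) -> str:
        """The environment that is not currently serving traffic."""
        return "green" if self.active == "blue" else "blue"

    def promote(self, healthy: Callable[[str], bool]) -> bool:
        """Switch traffic to the idle environment, but only if it passes
        the caller-supplied health check. Returns True if the switch happened."""
        candidate = self.idle
        if not healthy(candidate):
            return False  # leave traffic where it is
        self.active = candidate
        return True

    def rollback(self) -> None:
        """Point traffic back at the previous environment."""
        self.active = self.idle
```

New code is deployed to the idle environment and tested there; `promote` flips traffic once it checks out, and `rollback` is the "backing out a gaffe only takes a few minutes" step, since the old environment is still intact.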

As for backups, they're done nightly across multiple targets (both cloud-based and hardware-based) and the database is replicated across servers in real time. This covers the cases where, for example, there's a hard drive crash bringing down an entire server, my house burns to the ground, or someone hacks into one of the servers and holds it for ransom.

OK, I'll admit the last one is the one that worries me the most, however unlikely it is. But that's just how I am.
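As a rough illustration of "multiple targets", the sketch below copies one backup file to several destinations and confirms each copy by checksum. It assumes the targets are local or mounted directories; cloud targets would need their provider's upload tooling instead. The function names are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def replicate(backup: Path, targets: list[Path]) -> dict[Path, bool]:
    """Copy the backup into each target directory and report, per target,
    whether the copy's checksum matches the original."""
    expected = sha256_of(backup)
    results: dict[Path, bool] = {}
    for target in targets:
        target.mkdir(parents=True, exist_ok=True)
        copy = Path(shutil.copy2(backup, target / backup.name))
        results[target] = sha256_of(copy) == expected
    return results
```

A matching checksum only shows the copy arrived intact. It says nothing about whether the backup itself is restorable, which is why the restore test matters.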

Finally, I've thought about the long haul and "human disaster recovery" too. Nobody likes to think about their own demise, but it's a necessary part of living responsibly. Once a year, I do a full rebuild of the HB servers following a "runbook". That's a documented recipe for recovering from a catastrophic failure without my assistance. I morbidly refer to this as the "what to do if Dan gets hit by a truck" runbook. My wife doesn't think it's funny at all and my kids roll their eyes, but it documents what to do.

#3: Post by **TenLayers** » September 17th, 2020, 11:30 am

^^^^^
Wow!

#4: Post by **CarefreeBuzzBuzz** » September 17th, 2020, 11:45 am

Dan. Thank you. That's epically prepared. I'm glad the kids rolled their eyes. That means you're doing well.

September 17th, 2020, 12:30 pm

I am a devops lead if you ever want to pick my brain or get help.

For longer-term content preservation and large-scale disaster recovery, some other things come into play.

#6: Post by **civ** » September 17th, 2020, 1:06 pm

Hello:

CarefreeBuzzBuzz wrote: Thank you.

+1

CarefreeBuzzBuzz wrote: ...epically prepared.

That's what you get when you have a Pro at the helm.

Cheers,

CIV

#7: Post by **Peppersass** » September 17th, 2020, 3:36 pm

HB wrote:...or someone hacks into one of the servers and holds it for ransom. OK, I'll admit the last one is the one that worries me the most, however unlikely it is.

Unlikely until it happens. Black swans happen more often than the math says they do.

I've been worrying about ransomware attacks on a client's website. I haven't done much research on how such attacks are carried out, but I can think of relatively simple ways that the site database can be encrypted without your knowing, while the site continues to operate until all your backups are overwritten or become way out of date.

One key to preventing this is to use two-factor authentication for any access to the server, code (source and executable) and database. Another is to regularly test the backups so you can detect that you've been attacked before useful backups disappear. The testing interval sets the acceptable loss (e.g., losing a week's or month's worth of posts). Testing probably isn't enough, though. If the code's been hacked, you have to inspect the database backup to make sure the contents haven't been encrypted. It's probably worth inspecting the live database, too, though hackers might leave that unencrypted and only encrypt the backups until the day they throw the switch to encrypt your live database.
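One crude way to "inspect the backup" is to measure byte entropy: plain-text SQL dumps score low, while encrypted data looks close to random. The sketch below is only a heuristic under that assumption. The 7.5 bits-per-byte threshold and the function names are illustrative choices, and compressed backups also score high, so the check only makes sense on an uncompressed dump.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return sum(
        (count / total) * math.log2(total / count)
        for count in Counter(data).values()
    )


def looks_encrypted(data: bytes, threshold: float = 7.5) -> bool:
    """Flag data whose entropy is close to that of random bytes.

    The threshold is an assumed cutoff, not an established standard.
    """
    return shannon_entropy(data) >= threshold
```

A scheduled job could read a sample of each new dump and raise an alert when `looks_encrypted` returns `True`. Checking that the dump still contains recognizable recent posts would be a useful complement.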

Yes, I'm paranoid. 15-20 years ago I wrote threat models for an Internet-based voting system that used advanced cryptographic technology. The product was developed after the Bush v. Gore fiasco, at a cost of many millions of VC dollars. Alas, it passed the security tests but failed in the marketplace. In the present environment, it would be even more controversial than it was back then. If people won't believe in science, they won't believe in advanced mathematics, either.

#8: Post by **HB** » September 17th, 2020, 6:44 pm

Thanks for the new list of things to worry about.