Friday, August 17, 2012

SOP Friday: Backups Part 2 - Backup Philosophies and Client Communication

The Ten Core Truths About Computer Backups

This is Part 2 in a four-part series on backups. See Part 1: Defining Your Client Backup.

This week the topic is backup philosophies and client communication. As you can imagine, I encourage you to have a standard philosophy with all of your clients. Oddly enough, if a client has a strong alternative philosophy - That's Good! Most clients sort of vaguely know that they should have a backup. But they can't really articulate why, or what a good backup looks like.

What is a philosophy about backups? Quite simply, it's a standard set of beliefs or approaches you take. Here are the Ten Core Truths from our backup philosophy (Note: You might not like them all.):

Truth #1: Backup is not a disaster recovery plan.
A good backup is one component in a DRP (disaster recovery plan), but it is not a DRP. It is necessary but not sufficient for a DRP. Therefore, it is more important than a DRP.

Truth #2: We build all backup systems with Disaster Recovery in mind.
This goal keeps us focused on creating robust backup systems.

Truth #3: Good backup technologies don't fail on a regular basis; lazy and untrained technicians fail.
Whenever we look at a technology developed by the "big boy" companies and used widely across the globe, we know that technology is solid. Tape backups, for example, are absolutely the most reliable backup. They fail because hardware is set up wrong, software is configured wrong, technicians don't know how to use the systems they're selling, or human beings fail to switch the tapes.

Tape backups aren't perfect. They are expensive compared to current hard drive technology, and they are slow. But if I drop a ten-year-old tape off the top of the Empire State Building into a puddle of mud, I'm going to be able to recover 100% of the data. That's not true with a hard drive.

Don't get me wrong: We're moving away from tape. But it's not because tape doesn't work. It's due to speed and price. We know we are moving to less reliable technologies (given that tape is the most reliable). Therefore, we have to be even more careful about the backup systems we create.

Truth #4: You (the technician) must absolutely master the backup technologies you sell and use. 
That means at the hardware level, software level, media level, and the process level. This is one of your muscles of success. You need to exercise it and build up muscle memory so that you make good decisions in a crisis and have a good sense of the resources available to you.

Truth #5: Backup media must be rotated to permanent off site storage for several reasons.
Permanent off site storage means that they never come back. They sit on a shelf or in a vault essentially forever. Why?

a) Backups make a great snapshot in time for legal, financial, and H.R. reasons.

b) Each medium should only be used a certain number of times. Media that are used continuously forever become less reliable. This is true of tapes and hard drives equally. That's why we take them out of circulation.

c) Backups go off site in case the office systems need to be recovered in a disaster. This might include a fire, water damage, or another event that makes it impossible to get to the office.

d) The goal of backups -- including monthly off site backups -- is to provide the ability to restore the client systems and data. Our normal preparation is for today, yesterday, and the last week. But it is also important sometimes to go back in time a month or two, or even a year or two.

There are no limits to the good reasons for storing media off site long-term. Theft, fire, flood, and all kinds of things can happen to your office. If they happen to your home or off site storage facility on the same day, then you need good business insurance (and maybe a good lawyer). How can you plan for that?

Truth #6: You should have as many "points in time" as feasible.
This is critical. A basic backup will get you the file you deleted yesterday. A good backup will get you a file from last week or last month. A great backup will get you a file from several months or years back. A perfect backup will get you every version of every file ever created. (That perfect backup hasn't been invented yet, but it's good to think about.)

More than most technologies we deal with, backup systems have a very clear cost-quality relationship. You want something that kinda works? That's cheap. The perfect example is the USB drive brought home from Office Depot for $49. You plug it in and use whateverthehell software is included. You kinda sorta think you can recover a file if needed. But if you can't access that software . . . um . . .?

At the other end of the scale are systems that cost millions of dollars and are the key components of zero-downtime, instant fail over systems. In the middle, and much closer to the low end, are the $1,000-$2,000 systems we tend to put into client offices. From there you can move up to BDRs and cloud backups.

Create a simple checklist for your backup systems. Which points in time do you need to recover? Select ( Yes ) or ( No ) for each:

- One hour ago
- 12 hours ago
- Yesterday
- Three days ago
- Last week
- Two weeks ago
- Last Month
- Last Quarter
- 6 months back
- End of last year
- 12 months back
- 24 months back
- 5 years back
- 10 years back
- 20 years back
- Other ?

Now consider what your media rotation looks like. What must it look like in order to create the restore points you say you want? Now think about #4 above: You must absolutely master the backup technologies you use. Let's say you're backing up to disc. After three months of backups, what exactly is on each of your backup discs? Are they full images? Copies of files? Versions of files? How many restore points do you have?

(Note: In Part 4 we'll talk about hard drive media and other technologies, their pros and cons.)

If your backup system does you the favor of eliminating duplicate files, does that mean file names or file versions? How exactly does it work? If you have bad sectors on your hard drive, have you lost every version of a specific file? Master the technology.
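To see why the "file names or file versions" question matters, here is a minimal Python sketch of two ways a backup tool might eliminate duplicates. It is only an illustration: the function names and the sample files are made up, and no particular backup product is being described. Check how your own product actually does it.

```python
import hashlib


def dedupe_by_name(files):
    """Keep one entry per file name; a later version overwrites an earlier one."""
    kept = {}
    for name, content in files:
        kept[name] = content
    return kept


def dedupe_by_content(files):
    """Store each distinct content once, but remember every version that was backed up."""
    store = {}     # digest -> content, one stored copy per distinct content
    versions = []  # (name, digest) for every file seen, in backup order
    for name, content in files:
        digest = hashlib.sha256(content).hexdigest()
        store.setdefault(digest, content)
        versions.append((name, digest))
    return store, versions


# Hypothetical sample data: two versions of one file, plus a copy of the newer one.
files = [
    ("budget.xls", b"January figures"),
    ("budget.xls", b"February figures"),
    ("budget-copy.xls", b"February figures"),
]

by_name = dedupe_by_name(files)
store, versions = dedupe_by_content(files)
print("By name:", len(by_name), "files kept")
print("By content:", len(store), "stored copies for", len(versions), "versions")
```

In this sketch, de-duplicating by name silently drops the January version of budget.xls. De-duplicating by content keeps all three versions but stores the February data only once, so a single damaged copy would take out every version that points to it.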

Note: Many cloud backup systems fail horribly when it comes to restore points. Before you spend your money, learn what really goes on up in that cloud. Verify for yourself. Master the technology. If the only thing you can restore is the latest version of each file, is that good enough?

Truth #7: Use enough media to guarantee the restore points you want.
Ideally, we would like to see at least 6-12 long-term storage media off site in addition to media for the current week. If you have a safe place to store the older media on site, you could bring them back to the office.

Let's say you back up every business day. That's five times a week. Ideally, those are all full backups (we haven't used incremental backups since we moved away from reel-to-reel backups in 1995). So we have five "current" backups off site, plus end-of-month backups permanently off site for the last 12 months. That's a total of 17 media off site. If you have end-of-year backups, then you will have additional media off site.
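To make that arithmetic concrete, here is a minimal Python sketch of the rotation just described, checked against a few entries from the restore-point checklist in Truth #6. Treat it as a sketch under stated assumptions: the function names are invented for this example, and it assumes one full backup per business day, with the last business day of each month serving as the end-of-month medium.

```python
from datetime import date, timedelta


def retained_backups(today, daily=5, monthly=12):
    """Dates of the full backups still on hand under a simple rotation.

    Assumptions (hypothetical, for illustration): one full backup per
    business day, the `daily` most recent kept as "current" media, plus
    the last business-day backup of each of the previous `monthly` months.
    """
    kept = set()

    # The most recent `daily` business days before today.
    d = today
    while len(kept) < daily:
        d -= timedelta(days=1)
        if d.weekday() < 5:  # Monday=0 ... Friday=4
            kept.add(d)

    # The last business day of each of the previous `monthly` months.
    first_of_month = today.replace(day=1)
    for _ in range(monthly):
        month_end = first_of_month - timedelta(days=1)
        first_of_month = month_end.replace(day=1)
        while month_end.weekday() >= 5:  # step back off a weekend
            month_end -= timedelta(days=1)
        kept.add(month_end)

    return kept


def nearest_restore_point(target, backups):
    """Latest retained backup taken on or before `target`, or None if there is none."""
    candidates = [b for b in backups if b <= target]
    return max(candidates) if candidates else None


today = date(2012, 8, 17)
media = retained_backups(today)
print(len(media), "media on hand")
for label, days_back in [("Yesterday", 1), ("Last week", 7),
                         ("Two weeks ago", 14), ("24 months back", 730)]:
    target = today - timedelta(days=days_back)
    print(label, "->", nearest_restore_point(target, media))
```

Run for the date of this post, the sketch keeps 17 dates. A request for two weeks ago falls back to the July end-of-month backup, and a request for 24 months back finds nothing at all. If that last answer is not acceptable, the rotation needs end-of-year media too.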

Remember: The driver of this discussion is the number of restore points you need. Yes, it costs money. It's cheaper than going out of business.

Truth #8: The first media will fail.
This philosophy is true far more often than we'd like to believe. Basically, it amounts to this: Assume that whatever media you use to restore from will be bad. The first hard drive, the first tape, the first cloud backup. Assume something will go wrong.

Clients don't change tapes/discs. Power supplies go out. Discs get corrupted. Databases get corrupted and then backed up. You need to restore from way back before the corruption happened.

Once you assume that you need backup plans for your backups, then you begin to plan at a higher level. Set up the Department of Redundancy Department within your office. Plan for failure; then plan around it. This is your job.
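"Plan for failure; then plan around it" can be sketched in a few lines of Python. This is an illustration only: restore_with_fallback and fake_restore are hypothetical names, and the media labels are invented. The point is simply that a restore procedure should expect the first medium to fail and already know what to try next.

```python
def restore_with_fallback(media, try_restore):
    """Try each backup medium in order until one restores successfully.

    `media` is an ordered list of labels (newest first, say) and
    `try_restore` is a caller-supplied function that returns the restored
    data or raises an exception. Both are placeholders for this sketch.
    """
    failures = []
    for medium in media:
        try:
            data = try_restore(medium)
        except Exception as exc:  # a real tool would catch narrower errors
            failures.append((medium, str(exc)))
            continue
        return medium, data, failures
    raise RuntimeError(f"all {len(media)} media failed: {failures}")


# Simulated media: the first two are bad, the third is good.
def fake_restore(medium):
    if medium in ("Friday disc", "Thursday disc"):
        raise IOError(f"{medium} is unreadable")
    return f"data from {medium}"


used, data, failures = restore_with_fallback(
    ["Friday disc", "Thursday disc", "July month-end disc"], fake_restore
)
print("Restored from", used, "after", len(failures), "failed attempts")
```

Notice that the list of failures is returned along with the data. Every failed medium is something to investigate and replace before the next crisis.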

Truth #9: If you don't test your backup, you don't have a backup. If you can't restore from backup, you don't have a backup.
These are together because they're really the same thing. In addition to designing an awesome backup system, you need to design a process for restoring and testing that backup.

You don't need to perform a full restore in order to feel good about your backup. But you do need to restore from each drive that was backed up (e.g., C:, D:, x:), from each medium used (e.g., disc 1, disc 2), from the core O.S. area (system state on Windows machines), and from within key databases (e.g., a few mails within the Exchange database).

That's not trivial. Just like the backup, you need to craft a test restore that verifies you can get the data back where it belongs. This keeps your technicians tuned up on the process and verifies that you don't have hardware or software failures.

We recommend a test restore every month. If you can't do it remotely, plan to go on site.
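One way to keep the monthly test honest is to write down what must be restored and check the results against that list. The Python sketch below is a minimal illustration: the REQUIRED list mirrors the examples above (each drive, each medium, system state, a key database), while the log format and the untested function are assumptions made up for this example.

```python
# What a test restore must touch, following the examples in the text.
REQUIRED = [
    ("drive", "C:"),
    ("drive", "D:"),
    ("medium", "disc 1"),
    ("medium", "disc 2"),
    ("os", "system state"),
    ("database", "Exchange mailbox items"),
]


def untested(required, log):
    """Return the required (kind, item) pairs with no successful test restore in `log`."""
    passed = {(entry["kind"], entry["item"]) for entry in log if entry["ok"]}
    return [pair for pair in required if pair not in passed]


# Hypothetical results from this month's test restore.
log = [
    {"kind": "drive", "item": "C:", "ok": True},
    {"kind": "drive", "item": "D:", "ok": False},
    {"kind": "medium", "item": "disc 1", "ok": True},
    {"kind": "os", "item": "system state", "ok": True},
]

for kind, item in untested(REQUIRED, log):
    print("Still needs a successful test restore:", kind, item)
```

Here the D: drive failed its restore, and disc 2 and the Exchange database were never tested, so all three show up as open items. A failed test and a skipped test are treated the same way: neither proves you have a backup.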

Truth #10: We can't care more about the client's backup than they do.
Really, we do care more. But we have to tell ourselves that there's only so much we can do. If the client won't buy new media, won't take backups off site, won't let us get in to test restores . . . well, that's their choice. They get to decide how to spend their money. If they want the office manager to back up the data for a $12 MM company onto DVDs once a month, that's their decision.

We need to try to communicate to our clients, their employees, and our own employees about the importance of backups. We need to check them daily. And we need to push the client to take them seriously. But if they simply refuse to participate in the protection of their network, there's not much we can do.

On rare occasions, we have sent a memo to clients saying that we cannot be responsible for the success of their backups because they are not doing the things we outlined. We offer to do those things for them (including changing backup devices every day) for a price. Sometimes it works, sometimes not. But when clients just don't care, we have to try not to worry about it.

Client Communications

When you discuss all this with clients, you need to step back from the geekspeak quite a bit. At the same time, you need to thoroughly understand your company philosophy and make sure the client understands it too.

We get push-back on taking tapes out of circulation from some folks. We get lots of resistance to switching media (tapes or discs). Interestingly enough, we rarely get resistance based on overall price.

It is extremely helpful to have stories you can use to illustrate your points. Collect backup stories. Believe me, I have one or more for every point above. Stories help them connect your philosophy with the world that matters to them.

Remember, most clients see backups as a necessary evil. They cost money but have few tangible rewards 99% of the time. Of course, on the day you recover a database from six months ago, a backup is worthwhile. When the office burns down, a backup is worthwhile. When an employee sabotages a system, a backup is worthwhile.

I highly recommend that you write up a one-page memo on your backup philosophy and distribute it to clients and prospects. It might all be "background noise" to most, but it is a real selling point - especially if you emphasize competence. Clients really want a good reason to justify the money they spend with you. Put yourself up against a $60/hr trunk slammer when it comes to backups.

It also helps to get client endorsements. I've got a great one from my long-time client Hank: "Karl saved my business - twice in one year!" That kind of stuff goes a long ways.

Also, here's a great video: How Pixar Almost Lost Toy Story 2.

That's the story about how Pixar almost lost the Toy Story 2 movie due to bad backups. Pixar. Bad backups for a MONTH. The lesson is: If it can happen to them, it can happen to you.

Take this seriously. Create a philosophy about backups that gives your clients rock solid backup plans that work.

Your To-Do List

First, you need to write out your philosophy about backups. Believe me: Most of your competition has NO philosophy about this critical function. AND almost no one who reads this will actually follow through either.

Use my discussion above as a starting place, but really make it your own. What do YOU like to see with regard to processes, procedures, off site media, etc.?

Second, you need to go over this with your technicians and make sure they all understand it. This might lead to some discussions or debate. That's fine: It means they're thinking about backups!

Third, you need to implement whatever pieces you do not currently have in place. This might include selling clients additional media in order to make sure they're in compliance with your philosophy.

Fourth, you need to communicate this to clients in some form (as discussed above). Handouts are good. The more professional looking the better.

Fifth, everyone on your team needs to support one another around this policy.

Your Comments Welcome.

- - - - -
About this Series

SOP Friday - or Standard Operating Procedure Friday - is a series dedicated to helping small computer consulting firms develop the right processes and procedures to create a successful and profitable consulting business.

Find out more about the series, and view the complete "table of contents" for SOP Friday at

- - - - -

Next week's topic: Backups 3 - Backup Monitoring, Testing, and Management
and then . . . Backups 4 - Changing Technologies


