I won't repeat that here, but the short version is very simple: If You Don't Test Your Backup, You Don't HAVE One!
At least you can't prove you have a backup unless you can restore from it.
In our experience, 50% of all backup "systems" with new clients are not working when we take on the client. This has held true with tapes, discs, internet backups, on-site systems, off-site systems, etc.
Why? Well . . . Something goes wrong. Some jobs don't kick off until the previous job finishes. So if that job never finishes, you're done. Of course, disks go bad (spinning 24x7x365 can do that). Configurations change and backups don't, so the job is set to back up the old data and not the new. USB ports go bad. NAS devices drop a connection. Ports go bad on switches. Cables get unplugged.
And so forth. There are hundreds of reasons why a backup job can stop working.
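Because a job can stop quietly for any of these reasons, it helps to have something that complains when no fresh backup shows up. Here is a minimal sketch in Python, assuming the backup lands as files in a directory you can read. The function names and the 26-hour threshold are illustrative assumptions, not part of any particular backup product.

```python
from datetime import datetime, timedelta
from pathlib import Path
from typing import Optional


def newest_backup_time(backup_dir: str) -> Optional[datetime]:
    """Return the modification time of the newest file under backup_dir.

    Returns None when the directory holds no files at all.
    """
    mtimes = [p.stat().st_mtime for p in Path(backup_dir).rglob("*") if p.is_file()]
    return datetime.fromtimestamp(max(mtimes)) if mtimes else None


def backup_is_stale(
    last_backup: Optional[datetime],
    now: datetime,
    max_age: timedelta = timedelta(hours=26),  # assumed: nightly job plus some slack
) -> bool:
    """A backup is stale if none exists or the newest one is older than max_age."""
    return last_backup is None or (now - last_backup) > max_age
```

Run something like `backup_is_stale(newest_backup_time("/path/to/backups"), datetime.now())` on a schedule and raise an alert when it returns True. A fresh file only tells you the job ran. It does not tell you the data can be restored.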
Any good backup system will have (accurate) logs about the job, how it's performing, and whether it was successful.
But the ultimate test is that you have to restore from that backup to verify that it worked. Period. Every month. Religiously.
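A test restore is only useful if you check what came back. As a rough sketch, assuming you restore to a scratch folder and still have a reference copy of the same files, you can compare checksums file by file. The function names here are hypothetical, and this is one simple way to do it, not a feature of any specific backup tool.

```python
import hashlib
from pathlib import Path
from typing import List


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_restore(reference_dir: str, restored_dir: str) -> List[str]:
    """List reference files that are missing or different in the restored copy."""
    reference = Path(reference_dir)
    restored = Path(restored_dir)
    problems = []
    for ref_file in sorted(p for p in reference.rglob("*") if p.is_file()):
        rel = ref_file.relative_to(reference)
        restored_file = restored / rel
        if not restored_file.is_file():
            problems.append(f"missing: {rel.as_posix()}")
        elif file_digest(ref_file) != file_digest(restored_file):
            problems.append(f"differs: {rel.as_posix()}")
    return problems
```

An empty list means every reference file came back intact. Files that changed after the backup ran will show up as "differs", so compare against a snapshot taken at backup time or a set of test files that never change.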
The same holds true for Disaster Recovery Plans -- DRPs.
Here's the bad news: DRPs are even more fragile than backups. By definition, you need a great backup in order to perform a disaster recovery. But you also need many other things. Perfect documentation. Replacement hardware. Tech support for business applications. Communications. Insurance information. Account numbers. A chain of command. Authorization processes for all kinds of subsystems. A triage system for old equipment. And so forth.
Imagine that a client's office was destroyed by fire. You have no equipment. No printers, no server, no desktops, no hard copy files. No nothing.
[Side note: You have to think in these terms in order to create a workable DRP. A failed hard drive is not a disaster; it's a minor annoyance. In fact, a failed server shouldn't be much of a disaster either. A business runs on all the hundreds of things and processes that are inter-connected and used to make the operation flow. A disaster affects the organization itself, not just the technology.]
The client needs to continue taking orders, delivering products/services, paying bills, paying payroll, etc. In other words, they need to do everything they did yesterday. And they're all stressed out. And short-tempered, frustrated, and generally not fun to deal with. They might not have a place to work or the tools to get the work done.
They turn to you because you're the highly skilled technical professional. If the first words out of your mouth are "I don't know what to do," they'll be referring to you as their previous consultant.
So, lesson one, you need at least a basic, rudimentary plan of how you will recover their business when disaster strikes. Do you have DRP information stored somewhere in your PSA system (Autotask, ConnectWise, Tigerpaw)? Is it stored at the client's house or in a safe deposit box? Does the client know where to get this information, or is it stored in their garage, in an unmarked wooden box, next to the Ark of the Covenant?
Just as with a backup, we take great pride in our DRPs. And just as with a backup, we never know absolutely for sure that the DRP works unless we test it.
Please be very clear on this point: A Backup Is Not A Disaster Recovery Plan. A backup is one key component. It should be tested every month. But a backup won't help you with the control software for the welding machine, the impact printers for purchase orders, the customized invoice paper, new chairs, insurance claims, emergency bank loans, rental laptops, and a million other details that you won't be able to think of while pointing to the melted server and wondering whether you'll be able to get that hard drive out.
For most small businesses, a DRP is pretty simple. There's a very small, and obvious, chain of command. Press relations are a minor concern. With modern backup strategies, temporary systems can be set up in virtual machines or in the cloud.
In other words, getting the client limping along is usually not a huge undertaking that requires teams of specialists.
But to get the client fully functional, there need to be checklists, processes, and a prior agreement on how things will flow. Yes, the owner will likely make changes to the plan in the middle of the crisis. But those decisions should be made with an understanding of what the plan is in the first place!
The only way to make this go smoothly is to . . . .
1) HAVE a Disaster Recovery Plan
2) Walk Through the DRP with the client . . . at least once
3) Test the pieces of the plan that you can test
4) Keep the DRP up to date
The plan does not have to be excruciatingly detailed. In fact, it should be simple enough that you can cover it with the business owner in an hour and have everyone feel like they understand it. No one will read this plan during a disaster. In all likelihood, no one will ever read it except you. But you need to explain it to the client so that you get sign-off for the plan.
The DRP walk-through is a simple presentation, probably with a whiteboard, so that everyone knows what to expect. We all know what to do in a fire. Many offices even practice this. But most offices have no idea what to do after the fire.
The initial walk-through should be with the owner/manager. Then, after revisions are integrated, the entire staff should sit in on a revised walk-through. You're obviously not building up a new office, just talking through it, answering questions, and making sure everyone has heard the process once.
Many pieces of the plan can be tested. The most obvious of these is restoring the server. But you can also do things like have the boss bring in the insurance contact list, the employee phone contact list, etc.
Some pieces of the DRP cannot be "tested" but can still be verified. For example, if you're going to need three specialized printers, you can research what you need, where you'll get it, how much it costs, how to expedite the order, how to place the order, and any additional information that's needed. All of this information might change, so it should be checked at least once a year. In a real disaster, you probably won't have access to the office to look on the back of a machine to find out what the model number is.
And, to be honest, you don't personally have to update all this info. You can help the client collect it the first time and then just work with their staff to verify information each year after that. In this way, the client's employees will actually have a little "hands on" experience with what they will be asked to do when the "big one" hits.
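The yearly verification is easy to let slide, so it is worth tracking when each piece of DRP information was last confirmed. Here is a minimal sketch, assuming you keep a simple list of items with a last-verified date. The `DrpItem` record and the 365-day default are illustrative assumptions, not a feature of any PSA system.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class DrpItem:
    name: str            # e.g. a printer model or the insurance contact list
    last_verified: date  # when someone last confirmed the details


def items_due_for_review(
    items: List[DrpItem], today: date, max_age_days: int = 365
) -> List[str]:
    """Return the names of items last verified more than max_age_days ago."""
    cutoff = today - timedelta(days=max_age_days)
    return [item.name for item in items if item.last_verified < cutoff]
```

Hand the resulting list to the client's staff once a year. They do the checking, and you just record the new dates.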
- - - -
The bottom line: If you don't test your DRP, you don't have one.
And if you're not offering such services to your clients, you should be. No one else is! Besides . . . once you make friends with their insurance rep, that person will introduce you to more people who don't have DRPs at all.
It seems like articles like this are posted every few months on the various Small Biz blogs ... and I'm really glad they are!
This is really cliché, but if one person reads this today and goes "Oh, I really should double check our backup status..." then I think a lot of good has been done.