Wednesday, July 08, 2009

The Reality of Offsite Backup

People are extremely defensive about the choices they've made for backup systems. But more and more I can't believe how ill-prepared most consultants are AND how they have fooled themselves and their customers into believing that they have an adequate disaster recovery scenario.

Notice I did not say backup. Most have a fine backup. They can restore a file. Given enough effort, they can restore a database. But they can't recover a client's entire system in a true disaster.


Approximately every seven seconds I hear about another offsite backup strategy.

I know LOTS of people are making a lot of money off of them.

There are a few really great ones out there. But I believe that roughly 99.999% of all "internet backups" are bullshit, useless, and worse.

I am particularly offended by people who spread the bald-faced lie that tape backups don't work. I believe that incompetent technicians can't figure out tape backups. They still can't figure out SCSI. But that's not really an indictment of tape. It's an indictment of the incompetent technicians.

When I look at a backup system, I can't help myself: I think about disaster recovery.

Now, I fully admit that we do backups for many different reasons. We need an "ultimate" disaster recovery system. We need a file restore. We need to archive data.

At the end of the day, 99.999% of all online backup services amount to file recovery. When you start to talk disaster recovery, they hem and haw and talk about one of two scenarios:

1) Well, you're not going to back up EVERYTHING. This is just for the important data. That means (for example) 5 GB for data, 10 GB for the line of business DB, and 20 GB for the Exchange database.

or

2) Ship us a drive. We're all comfortable that FedEx will never lose anything, and no one is willing to take the time to break the encryption.


In scenario #1: I have zero clients that fit into that category. We have 5-attorney offices who are grateful for a little leeway on a 72GB Exchange database.

The concept that you will only back up the "critical" data and not the operating system is absurd beyond words. Isn't the operating system with all the security information, ACLs and SIDs important? Are you going to put in CD#1 and rebuild from scratch while you wait a week to download the critical data?

I'm sorry: What business can even stay in business while your awesome internet-based backup system has them totally OFFLINE for a week?

If a client has a full T1 and can download about 1.1 GB per 1.5 hours, and if they have a very realistic 50 GB of data, then the math is real simple: It will take three days to restore their critical data AFTER you totally rebuild the server from scratch.
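For anyone who wants to check that math or plug in their own numbers, here is a minimal sketch in Python. It uses only the figures above (50 GB of data, 1.1 GB per 1.5 hours). The function name restore_hours and its parameters are made up for illustration, and it assumes the line runs at a constant rate with nothing else competing for it.

```python
# Rough restore-time estimate using the figures from the post.
# Assumption: the download rate is constant and the link carries no other traffic.

def restore_hours(data_gb: float, gb_per_window: float, window_hours: float) -> float:
    """Hours needed to download data_gb at gb_per_window GB every window_hours."""
    rate_gb_per_hour = gb_per_window / window_hours
    return data_gb / rate_gb_per_hour

hours = restore_hours(data_gb=50, gb_per_window=1.1, window_hours=1.5)
days = hours / 24
print(f"{hours:.1f} hours, or about {days:.1f} days")
```

That works out to just under three days of nonstop downloading, and the clock only starts once the server has been rebuilt. Swap in a 72 GB Exchange database and the number gets worse.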

Remember: You're not uploading a working image. You've picked and chosen exactly the number of GBs the client is willing to pay for.

Conclusion: disaster recovery is unrealistic under these circumstances. This is an elaborate scheme to make sure file recovery can be accomplished in a reasonable time frame.


Scenario #2 is "ship a drive." WHAT? You've got a flood or a fire and the answer to getting the client back into business is to ship a drive? Under absolute best circumstances, with maximum expenditure of money, that's a 24 hour process. And in a perfect world, that's a drive with all the deltas since the drive was new.

Better.

But it's still 24 hours until you can start recovering the data.


Imaging is another option. If you can get a real time image up to the cloud, that's great. But what actually happens in a disaster?

I don't mean a fan goes out or a drive fails. I mean a disaster: The electricity is out for a week; flood; fire; earthquake; tornado; haz mat spill; etc.

In a true disaster, you do not have access to the building with the server. You need to get the client back in business. Exactly what do you plan to do with that SBS Server that's booted up somewhere on the internet?

Think about your average client. Have you done them any good?

Will you get them back in business in less than a week?

As I said, many people get defensive about this topic. But I believe you need to take a serious look at exactly what you would do if one of your client's buildings burned down today. Today. This morning. First thing.

Do you have a checklist?

Who calls who? Who initiates the recovery? What do you do for hardware? What do you do for communication? What is your realistic time frame to get them operational and making money again? And what's the time frame for completing a total recovery with all systems replaced and up online as they should be?

If you say "we back up 25 GB of the most important data to the cloud," then your client needs to 1) Slap you, and 2) Fire you today.

At KPEnterprises, Backup and Disaster Recovery are absolutely the highest priority. Even though we spend 99% of our time on other things, Backup and Disaster Recovery always come first. The client might be focused on opening Word files and having one less spam per day. But none of that makes any difference if we can't get them back in business when the unexpected happens.

In fifteen years, no client of ours has ever lost any data.

We've seen fires, floods, theft, and employee sabotage.

- - - - -

At the end of the day, there is a place for an internet-based backup system. But you need to pick yours very carefully. The crap that's advertised on Leo Laporte and Kim Komando is fine for grandmas and people who don't rely on their data to make a living.

But don't kid yourself that the tool you've chosen is any better.

Start your checklist:

Step One: Fire Destroys the building

Step Two: . . . what do you do?


Please take this very seriously. Do you have a whiz-bang system that really amounts to fancy file recovery?

Or do you have a disaster recovery system?


I'm afraid that one day something will happen in our business and hundreds of SMB business recovery plans will be put to the test at the same time. And, unfortunately, I'm very confident that most of them will fail miserably. Most of the systems will get back up eventually, but they won't be up in a timely manner and the owners will lose a lot more money than they should have.

:-)




20 comments:

  1. This is the kind of a thing I used to advise my partners through and eventually just gave the f up. People don't understand the value of offsite backup and they don't sell it as such either.

    "When your office goes up in smoke, your data will be elsewhere so you're all good and you don't have to worry about taking tapes and hard drives home."

    Yeah, but be prepared to be out for a week while the request is put in, drive is shipped, gets there, restore work begins and you're back up and running.

    Enterprises get this, and they back up the granular pieces because offsite backups are for retention control and long term archiving. They are not very useful for DR work unless you're really, really, really savvy. Most aren't.

    That said, we use our offsite backup for both local and remote and replication and granular and archiving. But we do so because we understand the product and have 2-3 other systems in play at all times.

    -Vlad

  2. I agree with your thoughts in principle. For the MOST part any of our customers using an off-site solution have something to complement it onsite. For some of our customers an off-site only solution is actually better than the tape they had in play previously. While I always found it absurd, some customers did not have the discipline to facilitate daily backups. Yes, I mean physically take a tape out and put the next day's tape in. In those scenarios having a "File Recovery Plan" is better than a "Who knows what we will get" plan. I think the cornerstone of all plans, DR or otherwise, is being completely open and honest with the customer about expectations.

  3. For those reasons I don't look at online storage (or 'backup') as a primary DR method.

    What I am considering it for is the opposite. As an org that has hundreds of gigs of large media files that may never be used again (but need to be kept around for 7 years), I am considering getting those damn things off of my network, so that DR plans can worry about the truly critical, need-it-NOW data.

    In the event DR plans need to be invoked - the last thing I need to worry about are those media disk wasters.

  4. I understand your point. I think that the problem often lies with the small business's lack of interest in backup and D.R. I constantly get asked why my solution costs hundreds of dollars per month, when they could use Mozy for $100 a year or something.

    I personally believe that tapes are a lousy option for small businesses. Your point that small business people don't have the discipline or interest to rotate tapes is evidence of the problem. I always preach that BU/DR is a system AND a process. You can have the best system, but the process has to be implemented flawlessly or it will fail.

    Here's my question for you: we know what you don't like, but how do you like to do it?

    David Dempsey
    Managed Data
    http://www.managed-data.com

  5. The part of the issue you cover is fair. There is more to the issue however and that is where you have "thrown the baby out with the bath water". As the founder of a SaaS Business Continuity Planning methodology and application we have spent 1,000s of hours looking at this issue. Offsite is just that, offsite. It can be a SAS 70 certified data center on the other side of town. That shouldn't take 2 days to get the drives which contain ALL the firm's data with time stamps, but in a regional disaster it may take 3-4 days to get the 5-10 servers they need to start the rebuild.

    Offsite backup done properly is the ONLY way to go. Tape is not a good answer at all.

    The key to getting people to understand is helping them develop a self authored Business Continuity Plan. This will help them understand their vulnerability and the best way to assure their survival and not become a statistic.

  6. As I said . . . people are passionate about this stuff.

    I agree with most of the comments here. Offsite backup needs to be done right. And it needs to be part of a much bigger picture.

    For way too many people, offsite internet-based backup is a third-rate panacea because it's being administered by a technician who clicked a button and doesn't understand anything about how to actually verify that it's working.

    MShipman summed it up: For some folks, you can't convince the client to spend 3 minutes a day protecting their entire multi-million dollar infrastructure. You might be tempted to say "We will care as much about your network as you do," except that every technician I've ever met (except one) cares more about his clients' networks than the clients themselves.

    This expanding farce is possible because clients want to write a check and make the problem go away.

    Nothing against the clients: We as an industry have told them that we'll transfer the liability for their network from them to us for a flat monthly fee. You can't blame the clients at the end of the day. But you also can't sell smoke and mirrors and let the client think everything's fine.

    When the smoke clears and the fire fighters go home, the owner will turn to you and say "When can we be back up? We've got a big telepresence seminar scheduled for noon."

    ElliotRoss has a good idea to store off all the crap that most people don't need all the time. I wonder if it would be better to put all that stuff on a hosted SharePoint site for $29.95/month.

    :-)

  7. Pete at Framework IT: Give us your URL.

    And please describe your system in a paragraph or two.

    As for your comments:

    A) You can't dismiss tape out of hand except as a ploy to sell a non-tape solution. 1,000 of the Fortune 1,000 companies rely on tape. It's not their only backup system, but it's a critical component.

    B) You're completely correct: If anyone has a self-authored DRP, they're going to have a major leg up on actually executing said DRP.

    But here's where it all falls down:

    - Let's say you can't figure out how to configure a tape rotation (and you're unwilling to learn).

    - So you buy a CDP device and never clear the logs. So when recovery time comes around, you find you haven't been backing up for months.

    The same technician who can't figure out tape will not be able to figure out SonicWall CDP.

    So then you go to an off-the-shelf internet backup because someone you've never met mentioned it in a Twitter Tweet.

    It's like the porn industry. You want to believe that this is a 19-year-old hottie, but it's really an escaped shemale convict who's so obese she hasn't gotten off the couch since Clinton was president.

    When the stuff hits the fan, how do I know where my clients' data is, who has access to it, and what their protocols are? As I mentioned before, there will come a day when all this stuff hits the front page of the New York Times -- and I don't want my company called out by name!

    To be clear: There ARE good, legitimate companies doing REAL, high quality internet-based backups. But they are few and far between. And the technicians buying and reselling these services are, for the most part, not doing enough homework on this.

    From your comment, it sounds like you have a build-your-own system of some kind? (I haven't done the research.)

    More info appreciated.

    :-)

  8. David Dempsey asks a great question: What do I like?

    Ugh.

    We have been extremely reluctant to adopt an internet-based backup strategy for several reasons.

    First, we have to have 99.999% reliability or nothing.

    Second, the data has to be in our control at all times. That means sending it off to a cloud in Arizona, Canada, or the Philippines is out of the question.

    Third, we need to be able to restore to a machine within a few hours and deliver a working solution to the client before four business hours have passed. That's very possible and quite reasonable to expect.

    We need to have a relationship with every single person who is going to have access to this data. Period.

    [Some people will say that this is unreasonable. Fine. Tell that to an accounting firm with 14 accountants who handles nothing except critically sensitive data. Tell that to the intellectual property attorney. If they really understand what you're doing with their data, clients want it under your STRICT control.]

    Until very recently, there were two options that were front-runners for us.

    One was to put a SonicWall CDP in our colo facility and duplicate the real-time images from client CDPs. The primary problem with this is cost. Gulp.

    The other option was to set up an HP Storage server and roll our own offsite imaging system.

    We did try to implement the Kaseya offsite backup. But we learned an extremely important lesson: Most clients don't have an internet connection that allows the Kaseya offsite backup to complete. Too slow. Too many resends. Too many lost packets.

    We have not used the Zenith BDR solution because we could not choose the offsite replication partner.

    Thankfully, Zenith Infotech has introduced a product that allows onsite backup images AND allows us to put an identical unit in our colo.

    Unless new information arises, our plan is to implement the Zenith strategy with as many clients as we can.

    :-)

  9. Karl,

    It is always up to people to make any system work properly. 'Nuff said.

    Framework IT (www.frameworkitllc.com) is a channel-only software and methodology development firm. We have created an environment that empowers our partners and gives them the opportunity to help their SME clients develop a 90% self-authored Business Continuity Plan with the reseller's help. It is a web-based solution that uses remote/off-site replication. No tape here.

    FYI: Framework IT was recently named by CRN as an Emerging Vendor of 2009.

    Thank you for allowing me to post this, and tell Eric hello.

  10. Karl,
    Putting the appliance in your colo helps because in most cases, you're local to your client. But what about when there's an extended citywide outage that lasts longer than the generators at your datacenter can sustain?

  11. There are many variations here that will work. Backup & disaster recovery isn't a one-size fits all topic.

    At our clients we use a blend of solutions. At one, we have Zenith Arcas replicating between 2 office locations. At some others we do local image-based backups that the client takes offsite weekly.

    If cost wasn't a factor I would absolutely host a replication site for my customers. I would have servers there that could do rapid virtualization of their images and I would then provide RDP access to them. I would also pre-contract with a disaster recovery site for office space, phones, fax, etc.

    Not everyone can afford this though. I can sell it easily enough, but if I walk in to an architecture firm that has been hard hit by our recent financial troubles with this they'll just laugh me out the door.

    This is why local consultants acting as Trusted Advisors will never be out of a job... every client has different needs and wants.

    We've all discussed at length the various options. I talked about some today in my blog. It's how we build and deliver our solutions that matters.

  12. Karl, there are several reasons why we don't use tapes anymore. Firstly, the tape drives are only as good as the software running the backup, and unfortunately I have yet to find a robust, reliable 'tape' backup software. Sure, we are able to configure the likes of Backup Exec, UltraBac, etc. to work, but then for no reason they just stop with all sorts of errors.

    Secondly, most of our customers are of the mindset that tapes are old technology and therefore it makes us look bad if we recommend them. Thirdly, we have found success using RDX drives.

    Most clients that use tapes don't actually take them offsite - they just put them in the safe. This is a big no-no. Did you know that tapes will not survive in a safe when there is a building fire? Most 'fire-proof' safes will protect to some extent, but the majority of them will still heat up inside, enough to melt a tape. Most people don't know this.

    We use Kaseya's backup solution (Acronis) and RDX drives and have had the best results we have ever had. Firstly, we can recover a server to different hardware, in most cases in under an hour. Secondly, we provide an archiving service for our clients (another billable income source) whereby we take one of their backups each month/year and keep an archived copy in our secure facilities. We have quick, easy access to it, it is easy to do, and it removes the stress and responsibility from the client.

  13. Catching up on comments here.

    Boogie: OUR colo is good for about a month on generators and has riot-proof windows. But your point is very valid. After 9/11 I fully admit that I can't foresee all the stuff that could happen. We have one client backing up over the internet to Iron Mountain. Costs a fortune.

    The Consultant: Give us your blog link.

    Folks in general: If you're selling body part enhancements, feel free to leave links.

  14. Michael: RDX is a good option. Still a little pricy but kicks butt on restore speeds.

    And don't worry about clients thinking some technology is "old" if they still have a fax machine, copy machine, and a phone system on premise.

    :-)

  15. I agree that most internet based backups are not great for a disaster recovery in a quick fashion. They should be combined with a local backup of some kind. Going to the Internet should be the last resort, not the first.

    If you look at where most medium to large businesses are going, it is away from tape.

    They use multiple methods to back up and recover the data they have. For example, if I am smart, I turn on Shadow Copy for files if it is a file server (Layer 1). If I have a SAN, I can create a snapshot schedule and replicate it (Layer 2). If I have a virtual environment and it happens to be VMware, a product such as Vizioncore esxRanger allows me to replicate data to another VMware machine or a disk someplace (Layer 3), or I can use a product such as Acronis and restore to a different hardware platform or VMware. Finally, if they see the need, they might go to tape or a product such as Unitrends, a disk-based backup system that uses disks I can take offsite (Layer 4). A product such as Data Domain can also do this.

    Some people just don’t understand disk or internet based backup solutions or how to design a true solution with multiple recovery points. You can do this for small businesses and make it affordable. Tape is just a PIA: tapes go bad, and the software can be problematic. Also, if you are going to do a restore to a different machine in a disaster, good luck trying to get it to boot without a bare metal backup option. Finally, how often have you had a tape go bad and been unable to restore the data? I have seen this numerous times in my career. How often do you test your restores for your clients?


    The idea of backing everything up is good, but in reality, are you going to restore a machine that crashed and had some corrupt data? No. I would rather put down a fresh OS and restore the data. It is quicker in the long run and cleans up all of the crap.

    I think the other issue is that when you talk DR, the discussion needs to be about more than just tape. What happens when a fire takes out your building? What is your plan? Where are your people going to meet? How are your clients and customers going to reach you? If you have multiple systems, which is the most important to bring up first? DR should start with a plan, and data recovery needs to be part of the overall plan. Most small businesses are not going to pay for a full plan, but the discussion should be brought up.

    If a fire destroys your building, how long is it going to take to get hardware and a tape drive? That is also a 24 hour proposition. What happens if tapes do not go offsite? (Who is in charge of taking them offsite? If you rely on a human, the error factor is very high.)

    Finally, let's discuss imaging. If I get it in the cloud, it is not that hard to get people to it if I have a good plan. The key again is designing a solution and not just pushing a product or a service.

    What I think the problem is with most techs and small business solution providers is not Internet based backups or tapes. It is the ability to design a solution that fits the client's needs and make sure that solution works. Most are more concerned about how much recurring revenue a solution will provide and spend less time ensuring the solution will work.

    By the way, a good solution that gets rid of tape and has multiple options is Unitrends, http://www.unitrends.com/. You can take disks offsite and encrypt them. You can also vault data to a remote location for a total disaster. It is just one of many solutions the market has to offer.

    Also, not all 1,000 of the Fortune 1000 companies use tape backup. The trend is away from tapes. See this article, which discusses how enterprises are designing backup systems these days.

    http://www.datadomain.com/pdf/DataDomain-CaseStudy-byIDC-Fortune50Financial.pdf

  16. Thanks, Karl. For any that are interested... I started a discussion of methods and technologies at Consulting Notes by Scott Cameron

  17. James: I want to make clear that I'm open to all kinds of stuff. But remember that someone who can't switch a tape is also not going to understand any kind of multi-tiered DR system.

    As Gilbert Arland said, "Failure to hit the bull's-eye is never the fault of the target."

    It takes human beings to make all these things work.

    In the end, a multi-tiered system is the best. That means Internet based backups play a role, but are not (by themselves) the answer.

    Thanks, TheConsultant, for the discussion group.

  18. I think no backup plan is complete without these minimums:

    1. Internal Backup: Volume Shadow Backup (file recovery) = Quick and simple for end user or administrator to recover in seconds. This requires extra disk storage.
    2. Onsite Backup: NAS or some form of attached storage where you can back up files, databases, and images for quick onsite recovery, or the ability to boot from a network or attached storage image. There is no way to test and restore a tape solution without a physical onsite tech assisting or investing in a mechanical tape changer (ridiculous option).
    3. Offsite Backup: for file and image recovery. This can be transported offsite via internet or pickup-and-replace courier. If you do internet-based backup then you had better have ownership of or direct access to the storage server. You must be able to gain physical access to the server for a quick dump of data so you can courier it out (this is what we do). Carbonite and these other cheap web-based storage services are good in theory, but when it takes 6 weeks to upload your data, guess how long it is before you can restore it when it is a real disaster?? You guessed it......6 weeks if you're lucky!

  19. Working from the wrong angle...

    It seems that most of the opinions on here are throwing technology at the backup function blindly without analyzing business risk first.

    Similar to insurance, you have to find a happy medium between ultra high premiums and shoddy coverage.

    All backup and DR strategies fit well for someone, somewhere.

    Tape is still good for some businesses because their tolerance for time to recover is higher...like a higher deductible.
    Also, you can have multiple independent tapes that have complete backups very inexpensively. If your backup NAS dies, then what? You have to troubleshoot it, get it replaced, reconfigured, and in the meantime, no backups. With a bad tape, use the one from the day before.
    Regarding reliability, the tapes and tape systems of today are significantly better than the previous years in terms of reliability...just read the label: Replace annually...and of course, keep it off the sun baked dash of your car. :-)
    As far as the comment about safes, the problem you have run into is that the tapes were stored in a fire rated safe, not a MEDIA RATED safe. If it takes more than a few hours for your fire dept to get there, move.

    Don't get me wrong, I like D2D as much as anyone...that is what I recommend to most.

    Again, the solution depends entirely on the cost vs the risk tolerance.

    People hem and haw about security and encryption, which is important, but they forget about physical security in their own office. Most techs are too busy gushing about the features of their backup/recovery solution to concern themselves with these basics.
    Most small business office servers and backup units can be stolen with a pickup truck, a rock and a Sawzall.

    We use Zenith's solution because:
    1. the data is encrypted with a password Zenith does not have when it is written to disk. There is your security.

    2. The data is stored locally providing for a quick restore of files or entire server, or using the built-in virtualization to be up and running fast in the event of a catastrophic server failure.

    3. The data is sent offsite, currently to another datacenter, with the capability of overnight shipping those images pre-loaded on a replacement BDR to an alternate location in the event of a site-wide disaster.

    That covers most of the bases.

    We have customers that use many different solutions...Tape, Tape Library, RDX, D2D, D2D2T, and our BDR devices...and some a combination...it doesn't mean one is better than the other...it is just the challenge of making sure you provide the customer with the one that best fits THEIR business backup and recovery requirements, not YOURS.

    If you got this far, thanks.

    -JT

  20. Good comments Juan. I particularly appreciate the perspective that there's more than one way to get it right. Or wrong.

