Proper disaster recovery planning averts a total tech meltdown

December 5th, 2013

A few months ago, our organization had a bad morning for technology.  First, a text message reached me at 7:00 am stating that "the office was down," meaning that it was no longer accessible from the internet.  The next text message asked whether I could check it out, since I was the closest one to the office.  My plans immediately changed.  I cancelled my first meeting of the day and drove to the office.

After some investigation, I found that one of our Dell EqualLogic SANs was down for unknown reasons.  With all the redundancy in the Dell SAN components, what I was staring at simply isn't supposed to occur - and yet it did.  Email was down.  Accounting was down.  Shared folders with our files were down.  Our clients were just waking up, and some of them would surely have issues and questions.

At CSI we try very hard to be actual users of the technology we sell and support.  It is not just a marketing brochure.  Over our 34 years of working with the various technologies of the day, we have found that by committing to use internally what we sell, we develop an intimate knowledge of what various technologies do well - and what they honestly don't do so well.

Since our office is down if the technology doesn't work, we have incredible personal motivation to figure out how to make it work properly all the time.   But, as happened on that morning, bad things sometimes do happen despite the best of planning, the best equipment, and the best intentions.

Since we use Paladin Email Defense for our email, our office immediately and automatically had access via a web page to ALL incoming emails.  We could read, reply to, print, and compose emails.  Within 20 minutes of being in the office we had gone through all the emails received overnight and into the morning, and had read and printed every email from our clients requiring technical follow-up - despite the office being without our primary email server.  If our clients told us something during the meltdown, we knew about it.  We continued to actively monitor our incoming email throughout our outage.

To defend against potential worst-case scenarios, we use our Paladin Cloud backup service to back up our critical data both in-house and off-site into the cloud, with redundant East Coast/West Coast copies.  A quick check of the backup logs showed that the most current backups had completed successfully the previous night.  If we needed it, our data was protected.

Our excellent CSI systems engineers did further troubleshooting.  They contacted Dell for guidance on our issue.  Dell told us that there was a glitch in their system where the automatic, redundant failover sometimes doesn't flip cleanly.  Knowing that allowed us to re-establish the full functionality of the office in about 20 minutes.  We were completely on-line with no data loss.  Dell had some suggestions on a software update to avoid this issue in the future.

To make matters worse, as soon as everything was again running smoothly, our business park suffered a complete power failure.  Fortunately, our UPSes kept all our equipment on-line, and our standby generator kicked in and had the office running off of generator power within 10 seconds.  Nothing went down.

Despite planning, despite redundancy, despite good procedures, two totally unrelated events happened on the same day within a few hours of each other.

However, CSI's disaster recovery plan worked as designed.  Our office continued to function for customer needs and no data was lost.  Our staff was not standing around wondering what to do and when the systems would be back.

Having been through two fires and two floods we are a bit sensitive to disaster scenarios.

If these events happened to you today, what would you be staring at?  Is your data backed up?  Is it backed up off-site?  How would you get your email if you were off-line?  How would you communicate with your customers?  What would your staff be doing during the outage?  How long would your office be down?

CSI has products and solutions to help you design a robust computer network.  We help keep you doing what you do for your customers by providing you with a plan "B" based upon your up-time needs and your budget.

We'd love to discuss with you how to improve the reliability of your network and improve your ability to recover from a disaster.

To find out more about how CSI can help you, contact us.

