IT Disaster Recovery

The challenge of having IT systems ready for the worst

New Disaster Recovery Guidance & Tutorial Site Launched

Posted by eagarg on 29 October, 2008

Well, as you can tell, I certainly haven’t been spending all my time blogging here on WordPress (although I really need to get around to that more often).

What I have been doing is writing some Disaster Recovery tutorials covering various DR-related topics, starting with some articles covering the foundations (preparing and planning). It’s been more difficult than I expected; however, it’s been a good exercise for me and I’m hoping they’ll really be helpful to others.

So whether you’ve recently been tasked with developing a Disaster Recovery plan for your organisation (and you don’t have a clue as to where to start) or you’ve been involved with DR for years (so you’re an expert but you still frustratingly know that you’ll always be learning and improving), please check out the tutorials and let me know what you think.

The disaster-recovery-guidance.com site also has a few useful articles, a book store (Amazon linked) and something new I’m working on – a customised search using Google technology.

Check it out at www.disaster-recovery-guidance.com

Posted in Business Continuity, Disaster Recovery, Disaster Recovery News, Disaster Recovery Planning, Disaster Recovery Testing | Leave a Comment »

My first Disaster Recovery Knol

Posted by eagarg on 30 July, 2008

I’ve written my first Google Knol, giving a good overview of the IT Disaster Recovery planning process.

It gives a bit of background on Disaster Recovery (and touches on Business Continuity) and then covers the first steps in the DR process – namely setting RTO and RPO objectives for your processes / systems and selecting a DR strategy. The article then also briefly touches on Disaster Recovery documentation and testing.
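To make the two objectives concrete, here is a minimal Python sketch. RTO (Recovery Time Objective) is the longest you can tolerate a system being down, and RPO (Recovery Point Objective) is the most data you can tolerate losing, measured in time. The system name, the figures and the `meets_objectives` helper are all invented for illustration; they are not taken from the Knol.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class Objective:
    """Recovery objectives for one system (all values are hypothetical)."""
    system: str
    rto: timedelta  # maximum tolerable time to restore service
    rpo: timedelta  # maximum tolerable data loss, measured in time


def meets_objectives(obj: Objective, downtime: timedelta, data_loss: timedelta) -> bool:
    """Return True if a recovery (real or test) stayed within both objectives."""
    return downtime <= obj.rto and data_loss <= obj.rpo


# Hypothetical example: payroll must be back within 24 hours,
# losing no more than 4 hours of data.
payroll = Objective("payroll", rto=timedelta(hours=24), rpo=timedelta(hours=4))

# A recovery test that took 20 hours and lost 2 hours of data passes...
print(meets_objectives(payroll, timedelta(hours=20), timedelta(hours=2)))
# ...while one that took 30 hours does not, even with the same data loss.
print(meets_objectives(payroll, timedelta(hours=30), timedelta(hours=2)))
```

Recording the measured downtime and data loss from each DR test against objectives like these is one simple way to see whether the chosen strategy is actually good enough.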

Check out: An Overview of Disaster Recovery Planning for IT Systems.

Let me know what you think!

Regards,

Gareth

Posted in Disaster Recovery | Leave a Comment »

ITDR.info – IT Disaster Recovery Portal Launched

Posted by eagarg on 7 December, 2006

A new portal providing articles, news and a directory of Disaster Recovery related products and services has been launched.

The aim of ITDR.info is to give Disaster Recovery professionals, and system administrators tasked with developing and testing an IT-related disaster recovery plan, the resources they need to create, maintain and improve their DR strategy and procedures.

The website builds on the resources that have already been created and made available at itdr.wordpress.com, but provides more articles and also provides DR related news and a directory of related product and service companies.

New material will be regularly added to the site, including DR related tutorials, technical recovery help and at some point in the future, a forum for discussing Disaster Recovery testing.

If you currently subscribe to the itdr.wordpress.com RSS feed, please update your RSS Reader to point to the feed for the new itdr.info site – http://itdr.info/index2.php?option=ds-syndicate&version=1&feed_id=1

If you would like to submit a Disaster Recovery related article to the itdr.info site to be considered for publication, please e-mail it to articles@itdr.info. Articles must be informative (i.e., not just product pitches) but at the end of the article you may add a short paragraph describing your company/product and a link to further information.

Posted in Business Continuity, Disaster Recovery, Disaster Recovery Links, Disaster Recovery Planning | Leave a Comment »

Educating End Users on Disaster Recovery and Business Continuity

Posted by eagarg on 11 October, 2006

When preparing to test your disaster recovery plan, you need to ensure that your users are aware of what you are testing, your objectives and why Disaster Recovery is so important.

This article can be used to provide your end users with an overview of what Disaster Recovery and Business Continuity are and to give them a better understanding of why they are important.

This work is licensed for use under the Creative Commons Attribution 2.5 License. To view a copy of this license, visit http://creativecommons.org/licenses/by/2.5/. You may copy and display this article elsewhere, as well as make derivative works based on this article, as long as you abide by the terms of the license. If using this article, you must credit Gareth Eagar as the original author and reference this website [itdr.wordpress.com].

Read the rest of this entry »

Posted in Business Continuity, Disaster Recovery, Disaster Recovery Planning, Disaster Recovery Testing | Leave a Comment »

Psssttt – looking for books on Disaster Recovery?

Posted by eagarg on 21 August, 2006

THE SUMMARY: Follow the link to http://astore.amazon.com/itdisasterrec-20 and find some good DR related books I’ve selected at the Amazon.com site (it’s easier than searching through the 370 results you’ll get if you search for “disaster recovery” books at Amazon yourself!). If you buy a book [or other product] after following my link, I’d really appreciate it because you’d be contributing to my book fund!

THE DETAIL: If you need some help or wish to expand your knowledge of Disaster Recovery, there are some great books that can help you do this.

However, if you do a search on Amazon.com for books with a keyword of “disaster recovery” you’ll get 370 results! Some of those are books on disaster recovery, but there are of course a lot of other books in the list that are not focussed on DR. There are also a bunch of articles you can purchase.

To make your life easier [and in the hope of contributing to my own book fund!] I’ve worked through a large part of the list and identified 9 books that I believe may be most relevant. There are some that would be ideal for system administrators looking for help with Disaster Recovery, and others to help DR planners and managers in developing their DR strategy and plan.

As a newly signed-up associate of Amazon.com, I’ll of course make a small commission on any books [or other products] that you decide to purchase after visiting my store on Amazon. That will be a welcome contribution to my book fund so that I can get to read more of the DR books on Amazon. So what I’m saying is I’d really appreciate you buying through my store if you’re looking for a book anyway!

So, follow this link to http://astore.amazon.com/itdisasterrec-20 and browse the books I’ve selected or search for more books through the links available there. Thanks – I really appreciate it!

Posted in Disaster Recovery, Disaster Recovery Links, Disaster Recovery Planning, Disaster Recovery Testing, System Recovery, UNIX Recovery, Windows Recovery | Leave a Comment »

Many UK Medical Facilities Affected by Data Centre Outage

Posted by eagarg on 7 August, 2006

On Sunday, 30 July 2006, a problem at a regional data centre caused the loss of the patient administration system for 80 National Health Service (NHS) trusts in the UK.

Apparently there was an issue with the power system that was being investigated by a technical team, when a power surge took out a number of servers. As a precautionary measure, the SAN was shut down. The site is protected by a high availability fail-over system, but for some reason this system failed.

As a result, the system that provides hospitals and health centres with information on scheduled appointments and planned operations was unavailable for a number of days. Access for 50 of the trusts was restored after 2.5 days (by end of day on 1st August) and the rest were up after 4 days (3rd August). During the outage, the staff at affected trusts reverted to a paper-based system.
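To put those figures in perspective, here is a rough Python sketch of the arithmetic. The trust counts and outage durations come from the reports summarised above; the one-day RTO is purely my own assumption for illustration, as no actual recovery target has been published.

```python
# Figures from the reports summarised above: 80 trusts affected,
# 50 restored after 2.5 days, the remainder after 4 days.
TOTAL_TRUSTS = 80
restorations = [
    (50, 2.5),                 # (number of trusts, days without the system)
    (TOTAL_TRUSTS - 50, 4.0),
]

# Total "trust-days" of lost service across all affected trusts.
trust_days = sum(count * days for count, days in restorations)

# Average outage per trust, weighted by how many trusts were in each group.
average_days = trust_days / TOTAL_TRUSTS

# Hypothetical RTO, for illustration only; no actual target was published.
ASSUMED_RTO_DAYS = 1.0
missed = sum(count for count, days in restorations if days > ASSUMED_RTO_DAYS)

print(f"trust-days lost: {trust_days}")
print(f"average outage per trust: {average_days:.2f} days")
print(f"trusts beyond the assumed RTO: {missed}")
```

Even under a fairly relaxed assumed target, every affected trust would have been well outside it.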

While various reports suggest that servers were taken out by the power surge and that the SAN was shut down as a precautionary measure, it appears that it was a failure with the SAN that became the primary problem. The manufacturer of the SAN, Hitachi, even sent in their own engineers to assist with the recovery.

The details of the problem and resolution have not been made public, so it is unclear why it took so long to get the SAN up and running again. At the time of the incident, one statement indicated that each system, or perhaps disk on the SAN, was being thoroughly tested before being reintroduced into the live production environment.

The data centre in Maidstone where the failure occurred is run by CSC and the patient administration system is run by CSC Alliance, which consists of CSC and various other companies that are the Local Service Provider for the North West and West Midlands Cluster of NHS Connecting for Health.

To get a feel for the size of the outage, a press release by CSC Alliance on 11 May 2006 boasted of the implementation of the Patient Administration System (PAS) across three hospitals which are responsible for treating half a million patients a year across an area of 1,000 square miles. At the time, they indicated that ‘five years worth of data for more than 500,000 patients’ had been transferred to the new system. This press release related to the patient administration system for one large trust – this outage affected 80 trusts.

The CSC Alliance contract to provide services to the NHS is worth around US$1,855 Million over 10 years [GBP 973 Million]. Reading some of the comments in response to this story at various sites shows a lot of people who are unhappy at their tax money being spent on a very expensive project like this that can’t even provide acceptable uptime.

Currently CSC Alliance is not releasing any details on what went wrong, but I am sure that a lot of people are asking questions about their Disaster Recovery plan considering that the fail-over site failed. When was their DR plan last tested? What made this situation so unique that the fail-over site could not be activated?

The fact that CSC Alliance are not giving details on what went wrong leads me to conclude that they had not planned properly for something, had not tested recently enough, had not had strong enough change control during the planned UPS maintenance event or something else that does not look good for them.

The truth is that you cannot plan for every possible scenario and that things do sometimes go wrong, but if I were in their situation and I felt good about my DR planning and testing, I would have been telling the world about my recent successful recovery tests while admitting that Disaster Recovery planning cannot prepare you for every possible scenario and that new lessons had been learnt through this event.

Posted in Disaster Recovery, Disaster Recovery News, Disaster Recovery in Real Life | Leave a Comment »

How good are your DR backups?

Posted by eagarg on 22 July, 2006

When preparing your IT systems for disaster recovery, your backups are obviously a central and critical component (unless of course you have replicated disk off-site, but even then you should probably still be taking local backups). Here are some questions to ask yourself regarding your backups to see if you really are prepared to use them for disaster recovery. Read the rest of this entry »

Posted in Disaster Recovery, Disaster Recovery Planning, Disaster Recovery Software, System Recovery, UNIX Recovery, Windows Recovery | Leave a Comment »

Symantec LiveState becomes part of BackupExec Family

Posted by eagarg on 19 July, 2006

According to an article on ChannelWeb, the Symantec LiveState product has now become part of the BackupExec family of products (BackupExec was a Veritas product prior to Symantec purchasing Veritas recently). As part of the move, the name of the product will change from LiveState to Symantec BackupExec System Recovery.

The new software now allows an administrator to push the BackupExec System Recovery agent to a remote system in order to create that system’s recovery points remotely. The pricing of the new version is lower than that of the previous LiveState software.

Read the full article at ChannelWeb.

Posted in Disaster Recovery, Disaster Recovery Software, Windows Recovery | Leave a Comment »

First Steps for System Administrators tasked with DR

Posted by eagarg on 12 July, 2006

So you’ve been tasked to sort out DR for your IT systems and you’re not sure where to begin?

Firstly, Disaster Recovery should be a component of a Business Continuity plan that has been established by the business. The IT systems are just a tool for the business (though obviously it’s a critical tool) and so a full Business Continuity plan should be created that addresses issues such as facilities, human resources, crisis management, public relations and so on.

If you don’t have buy-in from top management, you’re going to have a tough time. Even if you do succeed, when a major incident hits your organisation, recovered IT systems will be of limited use if the rest of the business cannot continue. (If your building is destroyed, where are your users going to work from to access the systems you’ve so lovingly rebuilt?)

So let’s say that you do have business buy-in, then your first step is Read the rest of this entry »

Posted in Disaster Recovery, Disaster Recovery Planning, Disaster Recovery Software, Disaster Recovery Testing, System Recovery | Leave a Comment »

Weekly Review of Interesting DR News and Articles [3 - 9 July 2006]

Posted by eagarg on 9 July, 2006

Here is the weekly review of articles and news stories relating to DR that I have found interesting or useful. Read the rest of this entry »

Posted in Business Continuity, Disaster Recovery, Disaster Recovery News, Disaster Recovery Software | Leave a Comment »