Disaster Planning and Data

Data collection, storage and retrieval are vital functions of any modern business, large or small. In the United States, 99% of roughly 29.6 million businesses are small firms (fewer than 500 employees). We would expect most of the country’s 18,000 large corporations to have hardened data facilities with bullet-proof business continuity plans and, given how heavily they too rely on data integrity, we would expect small businesses to follow suit. But that’s not necessarily the case. For example, Symantec recently published the results of its small and midsize business (SMB) Disaster Preparedness Survey, which reported what nearly 1,700 SMBs worldwide had to say about disaster recovery. Applied Research, which conducted the survey, found that over one third (36%) of SMBs do not back up their virtual environments, even though they face losing roughly 40% of their revenue in the event of a disaster that disrupts or destroys their data.

And it’s not just the data at stake. Business interruption due to a natural disaster such as a damaging earthquake can erode the company’s public image, cost it certifications, leave it unable to meet contractual obligations, and, of course, disrupt customer service, all of which reduce revenue. Awareness of a business’s level of risk from local or regional earthquake hazards such as severe ground shaking, landslides, liquefaction, fires and floods should inform any business continuity and data protection planning process. The OpenHazards Group, Inc. tools are designed to help both homeowners and business owners assess local earthquake damage potential, in order to make informed decisions.

Why not simply move data centers out of harm’s way into safer zones not subject to natural hazards? Unfortunately, it’s not that easy. As Randall Stross points out in Planet Google, studies of Internet search queries have shown that users are very sensitive to the response time of the search engine. Response times of less than 0.4 seconds are routinely expected, and response times in excess of 0.9 seconds produce a noticeable reduction in queries. For reasons like that, data centers valued in the hundreds of millions of dollars are typically located near large population centers, which puts them at risk from the same natural hazards as the people they serve. An example is the Pacific Rim (“ring of fire”), where in excess of 2 billion people are at risk from seismic disasters.

Where do you begin? IT consultant Susan Ward says the best defense against a disaster is a business continuity plan that includes proper data protection. That means regular archiving (preferably daily), using reliable media for backups, and keeping up-to-date copies of the data in a secure, off-site location. Wide area network (WAN) optimization technology can improve disaster recovery while also improving network response time. It also helps keep data flowing when part of the network is down.
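
The archiving routine described above can be sketched in a few lines of Python. This is only an illustration, not a recommended product or a complete backup system: the function name `make_backup` and the three directory arguments are hypothetical, and the "off-site" location is assumed to be a folder that is already mounted from a remote site.

```python
import shutil
import tarfile
from datetime import datetime, timezone
from pathlib import Path


def make_backup(source_dir: Path, local_dir: Path, offsite_dir: Path) -> Path:
    """Archive source_dir, keep one copy locally and copy it to offsite_dir.

    All three paths are hypothetical placeholders for this sketch.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    local_dir.mkdir(parents=True, exist_ok=True)
    offsite_dir.mkdir(parents=True, exist_ok=True)

    # Write a compressed, timestamped archive of the source directory.
    archive = local_dir / f"backup-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(str(source_dir), arcname=source_dir.name)

    # Read the archive back before trusting it as a backup.
    with tarfile.open(archive, "r:gz") as tar:
        if not tar.getnames():
            raise RuntimeError(f"archive {archive} is empty")

    # Copy the verified archive to the (assumed) off-site location.
    offsite_copy = offsite_dir / archive.name
    shutil.copy2(archive, offsite_copy)
    return offsite_copy
```

Run once a day from a scheduler, a routine like this covers the three points Ward lists: a regular archive, a copy that has been read back at least once, and a second copy away from the primary site.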

Metrics specified for any critical business processes should be mapped to the underlying IT systems and infrastructure that support those processes. Of course, the budget should be considered: while most business owners would like no data loss and no time loss, the level of protection required could be too costly. IT experts tend to agree that the best strategy for data protection is to plan ahead with the objective of preventing data loss in the event of a disaster. So in addition to a recovery plan, business owners may consider local mirrors of systems or data, and use disk protection technology such as RAID. Surge protectors reduce the effect of power surges, while an uninterruptible power supply (UPS) or a backup generator can keep systems going when power fails. Fires following earthquakes are common, so the data protection plan should call for alarms or extinguishers (or both). Many businesses choose to subscribe to backup services provided by a large data center. If the center is large enough, it is likely to have a mobile solution that can be installed quickly to restore operations.

In a recent article for Computerworld.com, Forrester Research Principal Analyst Stephanie Balaouras writes that technology supports disaster recovery preparedness, but it doesn't constitute a strategy or plan. Instead, she says, businesses need to have a framework in place to manage disaster recovery preparedness as a continuous process, not a one-time event. A continuously updated plan can save substantial money over the long run. Put another way, IT service continuity is less a reactive response to catastrophic events and more a focus on the nearly continuous availability of IT services. Considerations for the plan may include data availability rate, recovery objectives, the technology prerequisites, and the cost to deliver the service. Balaouras maintains that a common mistake businesses make when developing IT service continuity strategies is to lead with technology. While it may seem burdensome and complicated to conduct a complete analysis with risk management professionals, it's critical. With the results, claims Balaouras, you can identify IT requirements, risks, and impacts to create quantitative justifications for investment. You can find a truckload of information on business contingency planning by searching the Internet, but one particularly helpful source with recommended technologies and practices is the Guide to Contingency Planning, by Jeanne Dininni for Business.com.
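
To make two of those considerations concrete, here is a small arithmetic sketch in Python. It is an illustration rather than anything taken from Balaouras's article: the function names are hypothetical, "recovery objectives" is read here as a limit on how many hours of data the business can afford to lose, and a year is assumed to have 8,760 hours.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def allowed_downtime_hours(availability_pct: float) -> float:
    """Hours of downtime per year permitted by an availability target."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    return HOURS_PER_YEAR * (1.0 - availability_pct / 100.0)


def meets_recovery_point(backup_interval_hours: float, max_loss_hours: float) -> bool:
    """True if the backup schedule respects the data-loss limit.

    A failure just before the next backup loses up to one full interval
    of data, so the interval must not exceed the limit.
    """
    return backup_interval_hours <= max_loss_hours


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        hours = allowed_downtime_hours(target)
        print(f"{target}% availability -> {hours:.2f} hours of downtime per year")
    print("Daily backups, 4-hour loss limit:", meets_recovery_point(24, 4))
```

An availability target of 99.9% leaves about 8.76 hours of downtime per year, and daily backups cannot satisfy a four-hour data-loss limit. Numbers like these are one way to turn a plan's goals into the kind of quantitative justification described above.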

OpenHazards Group, Inc. CEO Bill Graves offers recommendations for technology that supports data protection, based on years of experience in the IT industry: 

  • Plan for major earthquakes and other disasters.  Consult sources such as the OpenHazards web site to determine the level of risk your data center might face.
  • Use an enterprise-quality UPS with multiple batteries. Graves uses a Matrix 5000 from APC, which has a large isolation transformer and two high-quality batteries. The battery condition is continuously monitored, and the system can be fully tested in situ without taking the computers off-line. His unit is 5 kW, but there are larger ones, and the system should be sized to the anticipated load.
  • A motor generator to back up the UPS. It should have automatic starting in the event of power failure and an automatic transfer switch. That way, the UPS is backed up and only has to have a few minutes of capacity in the batteries. Cutover should be fully automatic, with automatic recovery, logging of events, and periodic automatic cutover testing. Graves uses a 45 kW generator that runs on tank propane, so an interruption in the gas feed won’t disable the system. His unit is manufactured by Kohler and uses a Ford automobile engine converted to propane fuel.
  • Transient suppression on the inside circuits. The UPS won’t prevent one computer from spiking another adjacent one if it glitches badly. Graves suggests using metal oxide varistor transient suppression between machines.  He does this by using hospital grade suppressing receptacles from Leviton to connect power to the computers.  These are downstream of the UPS.
  • Beyond the UPS load, the motor generator has to support enough other facilities to get things working, such as emergency lighting that can be moved around the building. Manual transfer switching is OK in this application. Elevators either have to be on automatic transfer to the generator, or key personnel should be restricted from using elevators.
  • Hot swappable supplies with spares for all functional computing devices.  Also, hot swappable hosts. That way, failed units can be replaced without powering the system down and interrupting service.
  • Multiply connected routing between machines. No single router or link failure should bring down the system. Routers only, no hubs or switches. Sophisticated routers are topologically self-healing, and deliver load averaging over multiple edges of the directed graph. Consider whether to doubly home the hosts.
  • The cooling system must be backed up and redundant. If the A/C or rack fans fail, temperatures will rapidly spike and cause automatic shutdown.
  • Clearly marked manual shutoffs for electrical, water, and gas.  Propane or methane sniffers on the floor as appropriate may be coupled to the cutoff switches. Water flow alarms and manual or automatic cutoffs should be installed.
  • Emergency equipment available and clearly marked. Personnel should be trained to use it. Provide CO2 fire extinguishers and possibly a Halon preaction suppression system, although these are dangerous if personnel are not properly trained. Self-contained breathing apparatus (SCBA) equipment should be on hand, including face shields, or respiratory protective masks with shields. Flammable liquids should be properly stored in unbreakable containers and then inside steel cabinets.
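
The routing recommendation in the list above, that no single router or link failure should bring down the system, can be checked on paper before any hardware is bought. The Python sketch below is one way to do that, under stated assumptions: the network is modeled as a simple undirected graph, the node and link names are made up, and "bringing down the system" is taken to mean that the remaining machines can no longer all reach each other.

```python
from collections import deque


def is_connected(nodes, links):
    """True if every node can reach every other node over the given links."""
    nodes = set(nodes)
    if not nodes:
        return True
    neighbours = {n: set() for n in nodes}
    for a, b in links:
        if a in nodes and b in nodes:  # ignore links to removed nodes
            neighbours[a].add(b)
            neighbours[b].add(a)
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search from an arbitrary node
        current = queue.popleft()
        for nxt in neighbours[current]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen == nodes


def single_points_of_failure(nodes, links):
    """Return (nodes, links) whose individual failure splits the network."""
    nodes, links = list(nodes), list(links)
    weak_nodes = [
        n for n in nodes
        if not is_connected([m for m in nodes if m != n], links)
    ]
    weak_links = [
        links[i] for i in range(len(links))
        if not is_connected(nodes, links[:i] + links[i + 1:])
    ]
    return weak_nodes, weak_links


if __name__ == "__main__":
    # Hypothetical layout: two routers in a chain with one host on each end.
    nodes = ["host-a", "router-1", "router-2", "host-b"]
    links = [("host-a", "router-1"), ("router-1", "router-2"),
             ("router-2", "host-b")]
    print(single_points_of_failure(nodes, links))
```

In this chain both routers and all three links are single points of failure. Adding a second path, or doubly homing the hosts, is what removes entries from the two lists.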
