
Disaster Recovery

A backup counts only once its restore has been tested

Disaster Recovery (DR) is the technical side of crisis preparedness: how data and systems are restored after a catastrophic event such as a fire, a flood, or a massive cyberattack. Business Continuity (BC) is the strategic layer: how operations keep running during the outage.

This is about the survival of the business. Running IT without a functional, tested DR plan is an irresponsible business risk.

Anti-Patterns: The Fatal Hope

Many companies believe that "having a backup" is sufficient. But when it matters, it often turns out that recovery takes days, critical data is missing, or the hardware needed to restore simply isn't available. Without defined time targets (RTO) and data-loss tolerances (RPO), IT is rudderless in a crisis.

Defined Safety

  1. RPO (Recovery Point Objective): How much data loss is acceptable? (e.g. "A maximum of one hour's work is lost").
  2. RTO (Recovery Time Objective): How quickly do systems need to be back up? (e.g. "Core operations are back online within 4 hours").
  3. 3-2-1 Backup Rule: 3 copies of the data, on 2 different media, with 1 copy stored at an external, physically separate location (off-site).
  4. Immutable Backups: Backups that cannot be modified or deleted after the fact (protection against ransomware).
  5. Regular Recovery Audits: A backup that has never been successfully restored effectively does not exist. Restores are rehearsed regularly against a realistic scenario; the cadence depends on business impact, regulatory requirements, system criticality, and change rate (often quarterly).
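
The first four rules above can be checked mechanically against a backup inventory. The following is a minimal Python sketch, not a reference implementation: the `BackupCopy` model, its field names, and the three check functions are hypothetical and do not correspond to any particular backup tool. It assumes the inventory lists every copy of the data, including the primary one.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class BackupCopy:
    """One copy of the data (hypothetical model, not a real tool's schema)."""
    medium: str          # e.g. "disk", "tape", "object-storage"
    offsite: bool        # stored at a physically separate location
    immutable: bool      # cannot be modified or deleted after the fact
    taken_at: datetime   # when this copy was last refreshed


def meets_rpo(copies, now, rpo):
    """True if the newest copy is no older than the RPO."""
    if not copies:
        return False
    newest = max(c.taken_at for c in copies)
    return now - newest <= rpo


def meets_3_2_1(copies):
    """True if there are >= 3 copies, on >= 2 media, with >= 1 off-site."""
    media = {c.medium for c in copies}
    return (
        len(copies) >= 3
        and len(media) >= 2
        and any(c.offsite for c in copies)
    )


def has_immutable_copy(copies):
    """True if at least one copy is flagged as immutable."""
    return any(c.immutable for c in copies)
```

A check like this only shows that copies exist and are recent enough. It says nothing about whether they can actually be restored, which is why the recovery audit in rule 5 remains necessary.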

The Focus: Critical Processes

Not everything needs to be back online immediately. Recovery is prioritised by business impact: sales and logistics first, internal administrative systems second.
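
As an illustration, a recovery sequence can be derived from an inventory that records each system's impact tier. This is a minimal sketch under assumptions: the `System` fields, the tier numbering (1 = highest business impact), and the tie-break on RTO are illustrative choices, not part of any standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    """A recoverable system (hypothetical inventory entry)."""
    name: str
    impact_tier: int    # 1 = highest business impact
    rto_hours: float    # agreed Recovery Time Objective


def recovery_order(systems):
    """Order systems by impact tier, then by the tighter RTO within a tier."""
    return sorted(systems, key=lambda s: (s.impact_tier, s.rto_hours))


# Example inventory with made-up values.
inventory = [
    System("intranet wiki", impact_tier=3, rto_hours=72),
    System("logistics", impact_tier=1, rto_hours=8),
    System("sales platform", impact_tier=1, rto_hours=4),
    System("HR administration", impact_tier=2, rto_hours=48),
]

for position, system in enumerate(recovery_order(inventory), start=1):
    print(position, system.name)
```

In a real plan, technical dependencies (for example identity services or networking that other systems rely on) would also constrain the order; this sketch ignores them.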

FAQ

What is the difference between backup and Disaster Recovery?

A backup is a copy of the data. Disaster Recovery is the plan and infrastructure to turn that data back into a functioning overall system. Without a plan, data is just useless bytes.

Can RTO and RPO be set to zero?

True zero values are rarely achievable in practice and seldom economical. High Availability across multiple regions lowers the objectives significantly, but failover, replication lag, split-brain controls, dependencies, and DNS and client behaviour leave a residual recovery window and a residual data-loss risk. The task is to find an economic balance: what does one hour of downtime cost the business, versus what does it cost to approach these objectives?
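
The trade-off can be made concrete with a rough calculation. The sketch below is a simplification with invented figures: it assumes every incident causes an outage exactly as long as the RTO, it ignores the cost of lost data (RPO), and all option names and amounts are hypothetical.

```python
def expected_annual_cost(solution_cost, rto_hours,
                         downtime_cost_per_hour, incidents_per_year):
    """Yearly solution cost plus the expected yearly cost of downtime.

    Simplifying assumption: each incident lasts exactly rto_hours.
    """
    expected_downtime_cost = (
        rto_hours * downtime_cost_per_hour * incidents_per_year
    )
    return solution_cost + expected_downtime_cost


def cheapest_option(options, downtime_cost_per_hour, incidents_per_year):
    """Return the name of the option with the lowest expected annual cost."""
    return min(
        options,
        key=lambda name: expected_annual_cost(
            options[name][0], options[name][1],
            downtime_cost_per_hour, incidents_per_year,
        ),
    )


# Hypothetical figures: (yearly solution cost, achievable RTO in hours).
OPTIONS = {
    "nightly backup, manual restore": (20_000, 24),
    "warm standby": (80_000, 4),
    "multi-region active-active": (300_000, 0.25),
}
DOWNTIME_COST_PER_HOUR = 10_000   # assumed cost of one hour of outage
INCIDENTS_PER_YEAR = 0.5          # assumed: one major incident every two years

print(cheapest_option(OPTIONS, DOWNTIME_COST_PER_HOUR, INCIDENTS_PER_YEAR))
```

With these made-up numbers the middle option comes out cheapest: the expected totals are 140,000 for the manual restore, 100,000 for the warm standby, and 301,250 for the multi-region setup. Different downtime costs or incident rates shift the result, which is the reason the objectives should be derived from business figures and not set to zero by default.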
