How to Implement a Disaster Recovery Plan

System failures are inevitable. Third-party service providers get hacked. Files get corrupted. Hard drives and servers fail. At the very least, a disaster recovery plan minimises lost data. At best, it keeps a business alive.

In today’s tech-centric times, loss of data can be catastrophic, whether it’s the result of human error or a cyberattack. Proactive action is key. The sooner you implement a plan, the sooner you protect your business against fatal downtime.

But implementation can take a backseat to daily operations. If you’re in IT, drumming up urgency in your organisation might feel like an uphill struggle. Disaster recovery can also seem like a nebulous concept to many small and medium businesses, which leads to poorly structured plans or, worse, no plan at all. Below are the least disruptive and most time-efficient ways to put a plan in place.

Include the Entire Organisation

Many think of disaster recovery as solely the responsibility of the IT department. However, the best plan won’t leave the work to your technicians while everyone else twiddles their thumbs. Include employees in your plan: what should they do immediately in the event of an IT failure? How can work continue if restoring the system takes multiple days?

Getting support from the board or top management will also underscore the importance of the plan and help ensure that everyone treats implementation with urgency.

Document Your Cloud Inventory and Apps

We store so many things in the cloud that the average employee no longer thinks twice about the process. But keeping an active watch over the data you hand over to third-party software is important. Your recovery plan should include rolling backups of the information you process through third-party platforms. This way, should you lose access for a reason outside your control, such as the service provider going down or being hacked, you’ll still have local access to your mission-critical data.
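The "rolling" part of a rolling backup can be sketched in a few lines of Python. This is a minimal illustration, not a feature of any particular backup product: the helper name split_rolling_backups is hypothetical, it assumes you already export your data from the third-party platform on a schedule, and the retention count of 7 is an arbitrary example value. The function only decides which stored copies to keep and which have aged out.

```python
from datetime import datetime, timedelta


def split_rolling_backups(backup_times, keep=7):
    """Split backup timestamps into (kept, expired), keeping the `keep` newest.

    `keep=7` is an illustrative default, not a recommendation.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    newest_first = sorted(backup_times, reverse=True)
    return newest_first[:keep], newest_first[keep:]


# Example: ten daily exports, of which the seven most recent are retained.
start = datetime(2024, 1, 1)
daily_exports = [start + timedelta(days=n) for n in range(10)]
kept, expired = split_rolling_backups(daily_exports)
```

In practice the expired list would drive deletion of old local copies, so storage stays bounded while the newest copies remain available.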

Assign a Team or Point Person

Every day an organisation spends without a DR plan in place is a day it risks catastrophic data loss. However, many businesses seem to underestimate the threat. Unless disaster is actually knocking on your door, breaches or sudden data loss may seem like something that happens to other organisations.

Yet businesses can’t really afford to drag their feet. Nearly every business that suffers data loss lasting longer than 10 days goes bankrupt within a year. Avoid becoming part of that statistic by assigning an implementation manager and setting clear deadlines. Regular progress checks and clear accountability will ensure the ball isn’t dropped somewhere in the daily operations of your business. Creating an incident response team that can carry out the plan once disaster hits can also significantly reduce the damage done by data breaches.

Outline Crisis Communication Lines

Effective communication during an event can help mitigate damage and give your business a higher chance of recovering. List important contacts, such as stakeholders or relevant individuals in your IT department. Make sure employees have their own task lists for protecting their data.

Create a plan for how to contact people if phone lines go down or your office building becomes physically inaccessible. Your strategy should also include policies for communicating downtime to customers and the media.

Assess Your Plan Using Relevant Metrics

There’s no one plan that applies to all when it comes to disaster recovery. The specifics will shift based on many factors, such as the size of your workforce, the software you use, and the volume of data you need to recover.

There are ways to judge whether a plan is acceptable. When measuring your disaster recovery plan, look at your Recovery Point Objective (RPO) and Recovery Time Objective (RTO).

RPO

Your RPO is the maximum amount of data, measured as time since your last backup, that you can afford to lose. For instance, if you back up your systems every 5 hours, then you could lose up to 5 hours of work in the event of a data disaster.
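To make the arithmetic concrete, here is a small Python sketch of the 5-hour example. The function name data_loss_window and the timestamps are hypothetical, chosen only for illustration. It shows that the work lost in a given incident is the gap between the last backup and the failure, and that the worst case equals the backup interval.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup, failure_time):
    """Return how much work is lost if systems fail at `failure_time`."""
    if failure_time < last_backup:
        raise ValueError("failure_time must not precede last_backup")
    return failure_time - last_backup


backup_interval = timedelta(hours=5)         # the 5-hour schedule from the example
last_backup = datetime(2024, 3, 4, 8, 0)     # hypothetical timestamps
failure_time = datetime(2024, 3, 4, 12, 30)

lost = data_loss_window(last_backup, failure_time)  # 4 hours 30 minutes
worst_case = backup_interval  # failure just before the next backup runs
```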

RTO

Your RTO is the maximum time you can allow for bringing your systems back online. Actual recovery times can be as short as minutes or as long as a couple of days, depending on the severity of the data loss event.

These two metrics will decide if your plan works for your organisation. Some businesses may be able to survive losing a day’s worth of data, or being down for more than a couple of hours. But if your business can’t, then these constraints need to be communicated clearly so the plan can meet the organisation’s needs.
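One way to communicate these constraints clearly is to write them down as numbers and compare them with what your current setup actually delivers. The Python sketch below is a minimal illustration under assumed values: the RecoveryObjectives class, the find_gaps helper, and the 4-hour and 2-hour targets are all hypothetical, not figures from any standard.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class RecoveryObjectives:
    rpo: timedelta  # most data, measured as time, the business can lose
    rto: timedelta  # longest the business can afford to be down


def find_gaps(objectives, backup_interval, measured_restore_time):
    """Return a list of readable gaps; an empty list means the plan fits."""
    gaps = []
    if backup_interval > objectives.rpo:
        gaps.append(
            f"Backups every {backup_interval} exceed the RPO of {objectives.rpo}"
        )
    if measured_restore_time > objectives.rto:
        gaps.append(
            f"Restore time of {measured_restore_time} exceeds the RTO of {objectives.rto}"
        )
    return gaps


# Assumed targets: lose at most 4 hours of data, be down at most 2 hours.
objectives = RecoveryObjectives(rpo=timedelta(hours=4), rto=timedelta(hours=2))
gaps = find_gaps(
    objectives,
    backup_interval=timedelta(hours=5),
    measured_restore_time=timedelta(hours=1),
)
```

Here the 5-hour backup schedule fails the assumed 4-hour RPO while the restore time passes, so the only gap reported concerns backup frequency.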

Your RTO and RPO may also change based on function. For instance, you may need your mail servers restored faster than your CRMs. Grouping applications by importance can help make recovery more manageable.
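Grouping by importance can be as simple as recording a target for each application and sorting on it. In the Python sketch below, the application names and RTO values are made-up placeholders and recovery_order is a hypothetical helper; substitute your own inventory and targets.

```python
from datetime import timedelta

# Hypothetical applications and targets; replace with your own inventory.
applications = [
    {"name": "CRM", "rto": timedelta(hours=8)},
    {"name": "mail server", "rto": timedelta(hours=1)},
    {"name": "file archive", "rto": timedelta(days=2)},
]


def recovery_order(apps):
    """Return application names, tightest RTO first."""
    return [app["name"] for app in sorted(apps, key=lambda app: app["rto"])]


order = recovery_order(applications)
```

The resulting list gives the recovery team an agreed sequence to follow, so the systems with the tightest targets are restored first.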

Test Your Plan

No plan is going to be foolproof. However, there are ways to minimise the number of events your strategy can’t handle. This is where testing comes in. Just like fire and earthquake drills, data recovery testing can uncover flaws or miscalculations in the process before they do actual harm.

Testing isn’t a one-and-done affair. Your IT infrastructure is constantly evolving in complexity. So are the threats. Ideally, organisations will want to run maintenance checks every three to four months. This doesn’t have to be a disruptive exercise: it’s possible to conduct testing in smaller, controlled batches so as not to disturb regular operations.
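A simple reminder check can keep that cadence from slipping. The Python sketch below treats four months as roughly 120 days, which is an assumption made for illustration; drill_is_overdue is a hypothetical helper and the dates are invented.

```python
from datetime import date, timedelta

MAX_GAP = timedelta(days=120)  # assumption: roughly four months between tests


def drill_is_overdue(last_drill, today, max_gap=MAX_GAP):
    """Return True if more than `max_gap` has passed since the last test."""
    return today - last_drill > max_gap


# Hypothetical dates: last recovery test ran on 15 January 2024.
overdue_in_june = drill_is_overdue(date(2024, 1, 15), date(2024, 6, 1))
overdue_in_april = drill_is_overdue(date(2024, 1, 15), date(2024, 4, 1))
```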
