When disaster strikes, your business needs a plan in place to ensure you can recover your I.T. services in the shortest possible time. The process of creating a Disaster Recovery plan (DRP) is an essential area of security planning and, in many instances, a mandatory requirement for regulatory compliance.
Due to the destructive nature of some incidents, organizations should never underestimate the significance of a comprehensive Disaster Recovery (DR) strategy. It may well be the only risk management artifact which stands between survival and total catastrophe. As such, organizations must invest the necessary time and resources in creating a plan which is comprehensive enough to ensure the business can recover from any disaster. Any proper DR plan should contain several essential components. It should guide your organization as the crisis unfolds, establish critical business risk criteria, and determine the order in which you restore your I.T. systems.
Disaster Recovery Plan Scope
In recent years, the proliferation of cloud platforms has made it possible to replicate your entire environment to a cloud service for DR purposes. However, even with this low-cost alternative, I.T. budgets are limited, and running all your systems in your secondary environment during a disaster may not be cost-effective. Therefore, every DRP should define a scope which clearly describes which systems form part of your DR strategy. The element which helps you determine this scope is the service catalog.
A DR planning exercise should always start with a complete assessment of all your existing hardware and software assets. Once you have an understanding of your underlying infrastructure, you can then build a catalog of every technology service which runs your business. This service catalog is a crucial element in any DR plan as it contains a list of all your mission-critical services and their relevant dependencies.
With a service catalog, you can create a service map which indicates the dependencies key applications have on the underlying infrastructure as well as the critical connections which exist between your applications. For example, your email solution will not be able to operate unless it also has access to a Domain Name System (DNS) service. Therefore, should your email platform form part of your DR scope, you must ensure you also provide for the relevant DNS service on which it depends.
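To make the dependency idea concrete, here is a minimal sketch of how a service map could be used to expand a DR scope. The service names other than email and DNS, the SERVICE_MAP structure, and the expand_scope function are all hypothetical and only serve to illustrate the principle.

```python
from collections import deque

# Hypothetical service map: each service lists the services it depends on.
# Only the email -> DNS dependency comes from the article; the rest is invented.
SERVICE_MAP = {
    "email": ["dns", "directory"],
    "directory": ["dns"],
    "dns": [],
    "intranet-wiki": ["directory"],
}

def expand_scope(in_scope, service_map):
    """Return the in-scope services plus everything they depend on."""
    required = set()
    queue = deque(in_scope)
    while queue:
        service = queue.popleft()
        if service in required:
            continue  # already handled, which also guards against cycles
        required.add(service)
        queue.extend(service_map.get(service, []))
    return required

scope = expand_scope(["email"], SERVICE_MAP)
print(sorted(scope))  # ['directory', 'dns', 'email']
```

Putting only email in scope pulls in DNS and the directory service automatically, while the wiki stays out because nothing in scope depends on it.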
Once you have compiled a catalog of your essential services, you can start the DR planning exercise, as you now have a list of every application, its criticality to your business, and its dependencies. This information is vital as it helps you plan your restore order in the event of a disaster. It also helps you determine the budget and identify which services fall outside your DR scope.
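One way to derive a restore order from such a catalog is a topological sort, which guarantees a service is only restored after everything it depends on. The sketch below assumes Python 3.9 or later (for the standard graphlib module) and uses a hypothetical catalog with made-up services and a made-up criticality scale where 1 is the most critical.

```python
from graphlib import TopologicalSorter

# Hypothetical catalog entries; criticality 1 = most critical.
CATALOG = {
    "dns": {"criticality": 1, "depends_on": []},
    "database": {"criticality": 1, "depends_on": []},
    "order-system": {"criticality": 1, "depends_on": ["dns", "database"]},
    "email": {"criticality": 2, "depends_on": ["dns"]},
}

def restore_order(catalog):
    """Order services so dependencies come first, most critical first within each wave."""
    sorter = TopologicalSorter(
        {name: entry["depends_on"] for name, entry in catalog.items()}
    )
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    order = []
    while sorter.is_active():
        # Everything returned by get_ready() has all of its dependencies restored.
        ready = sorted(
            sorter.get_ready(),
            key=lambda name: (catalog[name]["criticality"], name),
        )
        for name in ready:
            order.append(name)
            sorter.done(name)
    return order

print(restore_order(CATALOG))
# ['database', 'dns', 'order-system', 'email']
```

Criticality is only used here to break ties between services that are ready at the same time. A real plan may weigh it differently, for example by restoring a low-priority dependency early because a high-priority system needs it.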
The primary objective of any DRP is to protect the organization’s revenue stream. If your business cannot generate revenue, then you cannot survive a disaster. As such, your service catalog should identify which systems are essential to generating revenue and prioritize these. However, even though the primary objective of DR is revenue protection, communication is another crucial element during a disaster. Prioritizing your communication systems is essential as it allows you to communicate with all your stakeholders during a crisis.
Disaster Recovery Strategy
The next component which you need to include in your DRP is the details of your DR strategy. In this section, you will need to elaborate on the technology solutions you will utilize to protect your environment, how you will activate your DRP, and how you will fail over to your secondary site in the event of a disaster.
Part of your DR strategy must include an elaboration on how you intend to protect your revenue-generating and communication solutions. As communication is an essential element, a good idea would be to host your email, voice, and messaging solutions with a cloud-based service provider. In this way, should your primary location go offline for whatever reason, your communication systems remain fully operational. For your revenue-generating systems, you could create a redundant platform on a cloud service which you could fail over to when your primary site is no longer functional. However, the key to any DR strategy is data replication and protection.
Protecting your data is an essential component of any technology DRP. Hardware and software can be rebuilt or repaired, but lost data often cannot be recreated or reconstructed. As such, even if you opt for a high-availability solution with real-time data replication between your primary and secondary sites, you still need offline backups. A DR event which corrupts your primary data, such as a ransomware attack, will also affect your secondary site if you are replicating your data in real time, because the corrupted data is replicated along with everything else. Backups form the cornerstone of every DR strategy. They ensure you have a secondary offline copy of your data which you can restore in the event of a disaster.
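The difference between replication and an offline backup can be shown with a deliberately simplified model. In the sketch below, the datasets are plain dictionaries, the "ransomware" is a loop that overwrites values, and the replicate and is_clean helpers are hypothetical. It is a toy illustration of the principle, not a description of how any real replication product behaves.

```python
import copy

primary = {"invoices": "valid data", "customers": "valid data"}

# Offline backup: a point-in-time copy taken before the incident
# and never touched by replication afterwards.
offline_backup = copy.deepcopy(primary)

def replicate(source):
    """Toy real-time replication: the secondary mirrors whatever the primary holds."""
    return copy.deepcopy(source)

def is_clean(dataset):
    """Return True if no record in the dataset has been encrypted."""
    return all(value != "ENCRYPTED" for value in dataset.values())

secondary = replicate(primary)

# Simulated ransomware attack on the primary site.
for key in primary:
    primary[key] = "ENCRYPTED"

# Replication faithfully copies the damage to the secondary site.
secondary = replicate(primary)

print(is_clean(secondary))       # False
print(is_clean(offline_backup))  # True
```

The secondary site ends up exactly as damaged as the primary, while the offline copy remains restorable.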
Disaster Recovery Procedure
The final essential component of a DRP is the procedure which the organization must follow in the event of a disaster. Typically, the DR procedure consists of three distinct phases which are Response, Resumption, and Restoration.
The Response Phase
The response phase of your DRP should detail when the DR plan should be activated and provide high-level information regarding your DR strategy. It should also include the names and contact details of the Emergency Response Team (ERT) who will be responsible for managing the elements and actions of the DRP during the crisis.
An additional element which needs actioning during this phase is your DRP’s communication plan. This document should clearly outline how you will communicate the relevant parts of your DR strategy to your various stakeholders. It must detail how the ERT will interact with your employees and contractors as well as your customers and other members of the public. Activating a prepared page on your website during a DR event would be part of such a communication plan. For example, you could publish the contact details of the ERT as well as further information on your recovery and restoration efforts.
The Resumption Phase
The resumption phase should include all the details the ERT needs to execute the failover of the environment and every other activity they need to carry out to action your DR solution.
The actions which the ERT will take during this phase depend on your DR strategy. If you have opted for high availability, they may only need to test that the automatic failover to your secondary site was successful. However, if your revenue-generating systems can remain offline for a specified period and you have no high availability in place, the ERT may then need to perform the necessary actions to bring your systems online from backups.
Naturally, the scale and type of disaster will determine which parts of your DR strategy need activation. If your entire location has been affected by a fire or flood, the DR response will differ from one where you may have only lost a critical system due to a catastrophic hardware failure. As such, a comprehensive DRP should detail possible scenarios, the severity of the impacts they will have on your business, and the relevant parts of the DR strategy which need to be activated.
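A scenario list like this can be kept as a small lookup table inside the plan. The following sketch is one possible shape for it; the scenario keys, severity labels, and action names are hypothetical placeholders loosely based on the fire, flood, and hardware failure examples above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Scenario:
    description: str
    severity: str     # hypothetical scale: "critical" or "major"
    activate: tuple   # parts of the DR strategy to activate

# Hypothetical scenario matrix; a real plan would list many more entries.
SCENARIOS = {
    "site_loss": Scenario(
        description="Fire or flood affecting the entire primary location",
        severity="critical",
        activate=("communication plan", "failover to secondary site"),
    ),
    "hardware_failure": Scenario(
        description="Catastrophic hardware failure of one critical system",
        severity="major",
        activate=("communication plan", "restore affected system from backup"),
    ),
}

def actions_for(scenario_key):
    """Look up which parts of the DR strategy a scenario activates."""
    if scenario_key not in SCENARIOS:
        raise KeyError(f"No DR scenario defined for {scenario_key!r}")
    return list(SCENARIOS[scenario_key].activate)

print(actions_for("hardware_failure"))
# ['communication plan', 'restore affected system from backup']
```

Raising an error for an unknown scenario is a deliberate choice in this sketch: a gap in the matrix is something you want to discover during testing rather than during a real incident.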
The Restoration Phase
The final section in your DRP should detail the steps needed to restore your environment to its original state once your primary location is back online.
The actions the ERT will take during this phase depend on your DR strategy and the actual systems impacted during the disaster event. Typically, this phase details how the ERT should ensure data integrity and must also include the completion of a production readiness checklist.
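One common way to check data integrity after restoring to the primary site is to compare checksums of the restored data against checksums recorded at the source. The sketch below assumes SHA-256 digests and uses small in-memory byte strings in place of real files; the file names and the find_mismatches helper are hypothetical.

```python
import hashlib

def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of a block of data as a hex string."""
    return hashlib.sha256(data).hexdigest()

def find_mismatches(expected_digests, restored):
    """Return names whose restored content is missing or differs from the expected digest."""
    mismatches = []
    for name, digest in expected_digests.items():
        content = restored.get(name)
        if content is None or sha256_of(content) != digest:
            mismatches.append(name)
    return sorted(mismatches)

# Data as it exists at the DR site, used here as the source of truth.
dr_site = {"orders.db": b"order records", "customers.db": b"customer records"}
expected = {name: sha256_of(content) for name, content in dr_site.items()}

# Data after copying back to the primary site; one file was truncated in transit.
restored = {"orders.db": b"order records", "customers.db": b"customer rec"}

print(find_mismatches(expected, restored))  # ['customers.db']
```

An empty result from a check like this could be one item on the production readiness checklist before users are moved back to the primary environment.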
Testing is an essential component of any DR strategy. You should test your DR readiness at least once a year. However, depending on the criticality of your I.T. environment, as well as the probability of a disaster occurring, you may need to test it more frequently.
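A test schedule is easy to track programmatically. As a small sketch, the functions below flag when a DR test is overdue; the 365-day default reflects the once-a-year guidance above, while the function names, dates, and the 90-day interval for a more critical environment are assumptions chosen for illustration.

```python
from datetime import date, timedelta

def next_test_due(last_test: date, interval_days: int = 365) -> date:
    """Return the date by which the next DR test should have taken place."""
    return last_test + timedelta(days=interval_days)

def is_overdue(last_test: date, today: date, interval_days: int = 365) -> bool:
    """Return True if the DR test interval has elapsed without a new test."""
    return today > next_test_due(last_test, interval_days)

# Annual schedule: last test was more than a year ago.
print(is_overdue(date(2023, 1, 10), today=date(2024, 3, 1)))  # True

# Hypothetical 90-day schedule for a more critical environment.
print(is_overdue(date(2024, 1, 10), today=date(2024, 3, 1), interval_days=90))  # False
```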
When testing your DR, creating a fictitious disaster scenario and then checking your plan is the recommended approach. For example, you could simulate a flooded data center and then activate a DR simulation based on that, or you could mimic a building fire which would involve the entire business taking part in the exercise. Either way, testing multiple scenarios will help you find any gaps in your plan which you can remedy before an actual disaster occurs.
Hope for the Best, But Prepare for the Worst
Disaster Recovery Planning is an essential function for every enterprise. As disasters could strike at any time, it is vital that you have a plan in place which you can activate to ensure the continuation of your business. A good DR plan starts with your technology scope, which is determined by your service catalog. From this starting point, you can then develop your DR strategy, which will in turn help you formulate your DR procedure. No DR plan is complete until testing has occurred. Conducting scenario-based simulations against your DR plan on a regular basis helps confirm that your strategy is valid and that your business is prepared for a disaster.
JMARK has been helping businesses in Springfield, Tulsa, and Fayetteville increase their success with the help of innovative I.T. solutions for thirty years. The driving force behind everything we do is to help our clients move their business forward and achieve their goals. To learn more about how we can help you protect your business with effective disaster recovery solutions, contact us today.