How resilient do you think your IT systems are?

By Sean Abbott, Country Manager, Zerto ANZ

Regular outages are affecting Australian organisations in every sector, from telecommunications to finance and retail, creating an urgent need for IT resilience to be front and centre in business planning.

The spike in outages at major Australian financial institutions in 2018 has also prompted the Reserve Bank of Australia and the Australian Prudential Regulation Authority (APRA) to jointly work on developing standards for reporting the performance of retail banking services. 

And when it comes to data specifically, disasters come in all shapes and sizes, from obvious threats to hidden ones. Organisations today spend a lot of time and money focusing on cyber threats.

The data protection evolution

Data protection is no longer just about protecting data from natural and manmade disasters. Today, companies must be able to recover quickly from a disaster and return to business as usual in the shortest time possible.

The impact of a disaster can be measured in a variety of ways: the downtime of a user app, the revenue a company loses, or even the backlash on social media when customers cannot pay for the items in their online shopping carts. Each of these means companies must be prepared for every potential catastrophe.

It’s no longer just about backup or replication, or even just adding redundancy to the infrastructure layers or high availability into the logic layers. It’s about bouncing back.

Achieving resilience

To be genuinely protected today, organisations need to move from the reactive (the traditional notion of data protection) to the proactive (accounting for planned outages, potential attacks and unplanned outages, while still incorporating the reactive, traditional notion of data protection). This may seem like a no-brainer, but many IT teams would confess that in the flurry of attending to the day-to-day needs of their organisations, they are constantly just treading water. There is little to no time to proactively plan for outages that can and will happen.

Yet preparing for both anticipated outages and inevitable unforeseen disruptions remains a critical step on the road to IT resilience.

It is critical to plan for expected outages, as even anticipated unavailability can cause a problem for the business. Disruptive upgrades, workload relocation and cloud migrations are all legitimate reasons for downtime, yet each still imposes substantial costs on the company.

The inevitable (yet unforeseen and unanticipated) disruptions require a contingency plan that goes beyond traditional backup. With security breaches and malware infections the norm rather than the exception, data protection must continue to evolve beyond simply preparing for the inevitable data centre system failures we’ve already seen.

Identify risk

The identification of risk is the first step in remediating potential system weaknesses to ensure a quick bounce back.

As you undertake this journey, however, it’s important to remember that risk isn’t limited to technology alone. Your risk identification activities should also involve processes and people.

Ensure that you’re across what goes on in your IT department and that systems and procedures are well documented, so that if personnel change, someone else in the team can get up to speed quickly and pick up the task easily. It’s always worth making sure that knowledge concentrated in one person does not become a risk for your organisation.

Control what you can and design around what you can’t

To work towards being more proactive than reactive, you’ll have to account for the level of control you have relative to the concern you feel. For example, your wide-area links are an external risk that you have little power over; your telecommunications provider controls them. If something goes wrong on the telco network, you’re powerless to fix it and are at the mercy of the provider.

However, having variables outside your control doesn’t mean that you’re without mitigation options. You may not be able to control the telco network directly, but you can design around this issue by utilising diverse routing over disparate links owned by different providers. These sorts of decisions are part of IT resilience, too. If you don’t have the reach to fix the real risk (e.g., the WAN link could go down), then you have to design around it (e.g., by implementing redundant WAN links).

You can also look for solutions to potential roadblocks that fall into both categories: those within your control and those outside it. Technological advancements in disaster recovery and data backup solutions can help your company prepare for both planned and unplanned outages. By converging your data protection strategy, they can also ensure that you are taking a holistic approach on the road to IT resilience.

The art of IT resilience is the orchestration of multiple different processes and technical solutions to protect an organisation’s data and applications. The move, first to virtualisation, then to software-defined computing and finally to all forms of cloud computing, has made a whole different level of resilience possible. But with the luxuries of cloud computing and the ubiquity of the Internet comes the responsibility of even more proactive planning.