Causes of data center outages are changing
According to Computer Weekly, nearly half of all data center outages result in a cost between $100,000 and $1 million. Preventing data center outages is a critical part of business continuity planning. However, while power outages used to be the prime cause of unexpected data center downtime, grids have stabilized, and the causes of downtime are now more likely to be specific to the individual organization and human in nature than tied to the electricity grid.
Instead, IT configuration and network problems are becoming more common, as pointed out by the Uptime Institute's Third Annual Outage Analysis report. The findings revealed that cloud providers and SaaS can add layers of complexity, and moving infrastructure to third parties puts you at the mercy of your vendor's downtime. However, if you use established services like Microsoft's Office 365 or Amazon Web Services (AWS), which have reputable infrastructure and reliable failovers, this is less of a concern. It's smaller vendors you have to worry about.
The other big problem pointed to by respondents was human error, per Mission Critical, especially at companies that integrate systems and employees from distributed workforces and remote devices. Untrained employees can wreak havoc on a data center if the right training and refresher courses aren't provided.
According to the report, of those surveyed:
- Nearly half believe that concerns about the resiliency of data-center/mission-critical IT have increased in the past year.
- 1 in 6 reported having an outage in the past three years, and pointed to stakeholder vulnerability as a major concern.
- Nearly two-thirds of those using third-party data services had a moderate or serious provider-caused IT service outage in the last three years.
- Almost half reported outages in the last three years for which human error was to blame, citing incorrect staff processes/procedures and failure to follow procedure as root causes.
While data center outages caused by power interruption can really only be prevented with adequate power backups, network and people errors can be planned for, and you can proactively address some of their most common causes.
Modern technological options like public cloud services and Internet of Things devices put data center networks at risk of distributed denial of service (DDoS) and ransomware attacks. You can use colocation facilities, blended ISP connections and carrier-neutral data center connectivity options to help guard against such attacks, and advanced data analytics to recognize potential security gaps.
Train your staff to be aware of and on guard against phishing attempts and social engineering, and make sure you have protocols in place to prevent errors by poorly trained staff members from taking down your data center. All it can take is a single piece of improperly coded software in a patch to bring down your enterprise and bring revenue to a screeching halt.
Using AI analytics, scheduling predictive maintenance and automating as many processes as possible can cut down on the risk of human error while improving productivity and cost efficiency in day-to-day operations. Ensure that daily operations are properly documented, that you're conducting regular inventory checks of cooling equipment, and that you're completing physical maintenance inspections.
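As a rough illustration of what an automated check might look like, the Python sketch below flags sensor readings (for example, cooling inlet temperatures) that drift sharply away from their recent baseline. The function name `flag_anomalies`, the window size, the threshold and the sample readings are all hypothetical choices made for this example; real predictive maintenance tooling is considerably more sophisticated.

```python
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0, min_sigma=0.1):
    """Return indices of readings that deviate sharply from the trailing window.

    All parameters are illustrative assumptions:
      window    - how many previous readings form the baseline
      threshold - how many standard deviations count as abnormal
      min_sigma - floor on the spread, so a perfectly flat baseline
                  does not turn tiny fluctuations into alerts
    """
    flagged = []
    for i in range(window, len(readings)):
        recent = readings[i - window:i]
        baseline = mean(recent)
        spread = max(stdev(recent), min_sigma)
        if abs(readings[i] - baseline) > threshold * spread:
            flagged.append(i)
    return flagged


# Hypothetical cooling inlet temperatures in degrees Celsius.
temperatures = [22.0, 22.1, 21.9, 22.0, 22.1, 22.0, 27.5, 22.1]
print(flag_anomalies(temperatures))
```

With the sample data, only the 27.5 degree spike (index 6) is flagged. A check like this could run on a schedule and open a maintenance ticket, so that a person reviews the equipment before it fails instead of after.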
Before hiring a third-party vendor to handle your networking needs or host your data center, do a deep background check and plenty of due diligence. Look at the company and its past clients, and find out if there have been any serious outages related to outsourcing.
Perle can help you with hardware needs as you work to ensure uptime. Read our customer success stories to learn more.