“A Netflix outage is annoying, but if you can’t access medical records, it could be life threatening.” — Barry Runyon, Research Vice President, Gartner
After a virus shut down its information technology systems in the spring of 2016, MedStar Health refused to admit patients to ten hospitals and 250 outpatient centers. Without easy access to patient records and lab test results, treating them posed too great a safety risk. This story is just one illustration of the high risks of downtime in healthcare, which are growing as technology becomes more and more woven into day-to-day patient care tasks.
It’s not only patients’ health that is at risk but also the health of the bottom line. In 2016, the Ponemon Institute reported that the average cost of an unplanned data center outage was $740,000, up from $690,204 in 2013 and $505,502 in 2010. Of the 15 industries included in the study, healthcare ranked number three for total costs of unplanned outages. Downtime costs for businesses most dependent on data were rising faster than average.
As healthcare providers seek to improve the quality of care and boost profitability, the increasing reliance on information technology is unlikely to abate. Electronic medical records (EMRs) and medical practice management software, for example, are now pervasive.
Clearly, downtime in healthcare is risky, costly and on the rise. Aware of the dangers they face, many healthcare providers have considered worst-case scenarios and developed a disaster recovery plan. However, “an ounce of prevention is worth a pound of cure.” As the nation’s healthcare system shifts from one based on curing sickness and disease to one focused on prevention and wellness, our strategies to maximize IT infrastructure performance should follow suit.
Predicting and Preventing Downtime
When your systems go down, it may feel like being sideswiped by a truck. It seems to come at you out of nowhere. In many cases, however, that’s because you don’t have the tools to scan your IT environment and see what’s coming your way. Of course, scanning the whole environment is easier said than done, because the issue causing the outage could arise within any technology: servers, storage or SAN. And that’s just the first layer of complexity. Within those technologies, you might have multiple vendors, and any one of them could be causing the problem. If you’re monitoring disparate technologies and vendors with a patchwork of tools, whether homegrown, freeware or supplied by each of your technology vendors, you don’t know where to look first. Also, with alert software, you only know there’s a problem once the truck has already hit you. Learning which server is down after it has already put patient procedures on hold is like the pain that tells you your arm is broken after the truck has hit you. You know where the issue is, but it’s too late to stop the suffering.
Instead, you need a tool that monitors your IT environment broadly and deeply across all technologies and vendors 24/7/365. It should cover computing efficiency, resource utilization, server operations, storage environments, SAN and virtualization efficiency. If you collect these measurements over time, you build up historical data that shows your regular performance patterns. This baseline information is essential. It enables you to spot both sudden deviations from the norm and longer-term trends. Deviations may indicate a problem. A sudden spike in memory usage, for instance, may call your attention to a memory leak that you need to fix before it impacts your business. Trends, such as steady growth in memory usage, are more likely due to changes in your healthcare operations. If you know the trends in resource utilization and can correlate them with what’s happening in your business, you can then make informed decisions about when, for example, to increase capacity.
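To make the difference between a deviation and a trend concrete, here is a minimal Python sketch. It is only an illustration, not how any particular monitoring product works: the function names, the sample memory-usage figures and the cutoff of three standard deviations are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def find_spikes(history, recent, z_limit=3.0):
    """Return (index, value) pairs in `recent` that deviate sharply from the baseline.

    The baseline is the mean of `history`; a reading is flagged when it lies
    more than `z_limit` standard deviations away (z_limit is an assumed cutoff).
    """
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat baseline: any different value counts as a deviation.
        return [(i, v) for i, v in enumerate(recent) if v != baseline]
    return [(i, v) for i, v in enumerate(recent)
            if abs(v - baseline) / spread > z_limit]


def trend_per_sample(values):
    """Least-squares slope of the readings: the average change per sample."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


# Hypothetical memory-usage readings, in percent.
history = [61, 63, 62, 60, 64, 62, 63, 61]   # the established baseline
recent = [62, 63, 91, 64]                    # contains one sudden spike
spikes = find_spikes(history, recent)

# Hypothetical readings that grow steadily by two points per sample.
growth = trend_per_sample([60, 62, 64, 66, 68])

print("Spikes:", spikes)
print("Growth per sample:", growth)
```

The spike check answers "is something wrong right now?", while the slope answers "how fast is demand growing?", which is the figure you would weigh against planned changes in your operations when deciding on capacity.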
Since watching your entire IT infrastructure is a big job, it’s best to narrow your focus by setting thresholds for performance and utilization. Then you only have to pay attention when a limit is exceeded. You might, for example, set a CPU threshold of 85% usage. That gives you time to take action before you run out of headroom, putting lives at risk and impacting the health of your bottom line.
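A threshold check of this kind can be sketched in a few lines of Python. Only the 85% CPU figure comes from the example above; the memory and storage limits, the metric names and the sample readings are hypothetical values picked for illustration.

```python
# Utilization limits in percent. Only the CPU value comes from the article;
# the memory and storage limits are assumed for the sake of the example.
THRESHOLDS = {"cpu": 85.0, "memory": 90.0, "storage": 80.0}


def exceeded(readings, thresholds=THRESHOLDS):
    """Return only the readings that are above their configured threshold."""
    return {name: value for name, value in readings.items()
            if name in thresholds and value > thresholds[name]}


# Hypothetical current readings for one server, in percent.
readings = {"cpu": 88.5, "memory": 71.0, "storage": 80.0}
alerts = exceeded(readings)

for name, value in alerts.items():
    print(f"{name} at {value}% exceeds the {THRESHOLDS[name]}% threshold")
```

Here only the CPU reading is reported, because storage sits exactly at its limit rather than above it. Everything that stays within bounds remains out of view, which is the point of narrowing your focus.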
With IT infrastructure monitoring tools that give you a bird’s-eye view of your entire IT environment, you can take preventive action, reduce downtime and slowdowns, increase the quality of care you offer and remain as profitable as possible.
Cost of Data Center Outages, January 2016, Ponemon Institute. Accessed 7/20/16 from http://www.emersonnetworkpower.com/en-US/Resources/Market/Data-Center/Latest-Thinking/Ponemon/Documents/2016-Cost-of-Data-Center-Outages-FINAL-2.pdf