One of the most critical aspects of any health disaster situation is reaction time. The faster healthcare workers or first responders can react, the more lives they can save. This is especially true in epidemiology, the branch of medical science that deals with the incidence, distribution, and control of disease in a population. When dealing with diseases, especially those that are airborne or spread through touch, reacting quickly can make the difference between a few people being infected in a local area and hundreds or more being infected nationwide or globally.
Many people are looking to big data to help improve this response time. If we could harness information from internet searches, social media platforms, and crowdsourced GIS information from mobile devices, we could track trends in epidemics and hopefully save more lives. If we can identify where a disease is likely to spread next, personnel and resources can be sent there.
When considering the ethical implications of using big data, there are two main points.
Public Safety vs. Individual Privacy
First, how do you use the information to benefit the greater good without treading on individual rights to privacy? Where is the line drawn? Health administrators have faced this ethical question of how to protect patient privacy while remaining mindful of the greater community’s safety for a long time, even before HIPAA. However, with this new source of data, the question becomes even more important.
Businesses use the data collected from social media and internet interaction to customize ads to individuals, target specific audiences with their marketing, and discover micro areas where they may be underperforming. Even though most people don’t read the Terms and Conditions of websites, these uses are usually spelled out there. Besides, most people have become inured to these uses of their data.
However, if businesses and websites start giving government and global health organizations all the information about searches and purchases regarding flu, Ebola, or other health hazards, are they violating their agreement with their users?
Plus, who is allowed access to the data? If information about a possible pathogen goes out to all health providers, it might prevent a missed diagnosis or misdiagnosis of a critical patient, possibly stopping further spread of the disease. However, do law enforcement agencies, pharmaceutical companies, and insurance companies also get access? They might have an interest in knowing these things in the long run, but should they get the full data set or just information about patients?
Let’s look at it from the other side: if businesses have data pertinent to a health crisis but don’t share it for fear of violating privacy, is that ethical?
In 2014, there was an outbreak of Ebola in West Africa that eventually killed over 10,000 people in three countries. While Ebola is extremely infectious (only a minute amount is needed to cause illness), it is not the most contagious of diseases because it is not airborne. Transmission is through contact with bodily fluids or contaminated objects. As Bill Gates concluded in an article in The New England Journal of Medicine, if response time had been faster and data on the outbreak more readily available, the death toll might have been much lower.
What if it had been something airborne, like influenza? In 1918, the Spanish flu killed an estimated 20 to 50 million people worldwide. With growing concern about treatment-resistant bugs, it is conceivable that a new strain of influenza could cause an epidemic. If response times are slow in treating and quarantining patients, it could happen again.
If data collected from social media and internet searches could help prevent loss of life on this scale, who would complain about privacy issues?
Recording vs. Predicting
The second consideration in using big data concerns the accurate processing of the data. Once all the information is collected, it needs to be analyzed. We aren’t looking for a simple record of the epidemic; we need to be able to spot trends in the data. The power of big data collection lies in the ability to spot an outbreak more quickly and then accurately gauge where it is going.
Somehow, the global epidemiology community needs to be given the tools to rapidly deploy personnel and supplies. In the Ebola outbreak, we learned that incomplete data can lead to an insufficient response, because responders cannot see where the disease has spread. If we have the data, the next step is being able to quickly and accurately analyze it, then make decisions based on the information. Humans need to make the decisions, but computer modeling should be able to analyze the data.
However, that is apparently not as easy as it sounds. Google Flu Trends, launched in 2008, attempted to do just that. Using data from internet searches, it was designed to predict flu outbreaks. It was able to accurately reproduce past flu trends from historical data. However, when applied to real-time data, it didn’t fare as well. It under-predicted one year and over-predicted another. It has since been discontinued.
If this model had been used to make actual decisions, the results could have been disastrous. Health officials could have been underprepared, with too little vaccine and other supplies, or there could have been widespread panic and hysteria when none was warranted.
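To make the gap between fitting historical data and predicting live data more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: every number is invented, the "search index" is hypothetical, and the simple straight-line fit is a stand-in, not the method Google Flu Trends actually used. It assumes that in a later season, news coverage drives extra searches from people who are not ill, so the old relationship between searches and cases no longer holds.

```python
# Synthetic illustration only: all numbers are invented, and this simple
# linear fit is NOT the method Google Flu Trends used.

def fit_line(xs, ys):
    """Ordinary least squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict(xs, slope, intercept):
    """Predicted weekly cases for each search-index value."""
    return [slope * x + intercept for x in xs]

def mean_abs_error(predicted, actual):
    """Average size of the weekly miss, ignoring direction."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def mean_bias(predicted, actual):
    """Average signed miss; positive means the model over-predicts."""
    return sum(p - a for p, a in zip(predicted, actual)) / len(actual)

# Historical weeks (hypothetical): searches track reported cases closely.
hist_searches = [10, 20, 35, 50, 40, 25, 15]
hist_cases = [105, 195, 350, 510, 395, 255, 145]

# Later season (hypothetical): similar case counts, but far more searches.
new_searches = [15, 30, 60, 90, 70, 40, 20]
new_cases = [100, 200, 340, 500, 400, 250, 150]

# Fit on the historical weeks only, then score both periods.
slope, intercept = fit_line(hist_searches, hist_cases)
in_sample_mae = mean_abs_error(predict(hist_searches, slope, intercept), hist_cases)
new_predictions = predict(new_searches, slope, intercept)
out_sample_mae = mean_abs_error(new_predictions, new_cases)
out_sample_bias = mean_bias(new_predictions, new_cases)

print(f"Average miss on historical weeks: {in_sample_mae:.1f} cases")
print(f"Average miss on the later season: {out_sample_mae:.1f} cases")
print(f"Average signed miss on the later season: {out_sample_bias:+.1f} cases")
```

In this made-up scenario the model looks accurate on the weeks it was fitted to and then over-predicts badly once search behavior changes. That is why a model should be judged on data it has not seen before anyone bases decisions on it.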
The benefits of big data to public health and controlling epidemics seem obvious, even if we are not yet able to use it to its full potential. However, if we don’t set some ground rules now about ethical use and collection, there could be abuses of this information in the future. The public deserves to know how and where their information is being used.