Managing Public Health In Age Of AI & Analytics: Could We Avoid Outbreak Of Viruses Like Corona?

Not yet there, but we are getting closer. Analytics, machine learning, and AI promise better public health outcomes.

Granted, we are nowhere close to predicting outbreaks and stopping the spread of viruses such as COVID-19, the New Year virus that broke out in Wuhan in January 2020 and has now become a global scare. But the pace at which AI and analytics are being applied to public health will bring us closer to that day.

Says Theresa Do, Professor of Epidemiology & Biostatistics at George Washington University in Washington, DC, “With the Coronavirus (now officially named COVID-19) it is important to consider all data elements that can help identify the ongoing threat. Traditional public health data sources and other sentinel data sources such as social media can be mined for disease indicators as well as possible misinformation.” Theresa Do is also the Support Manager for Federal Healthcare at SAS, the analytics company. She adds that analytics and AI could comb through blogs and other information for early indications of a possible outbreak, which could then be verified against hospital admissions showing an uptick of patients presenting with a fever and similar symptoms.

It’s all about data: massive amounts of data collected, analyzed and presented in real time. Theresa explains that for cases that present to a hospital, data can be leveraged to identify symptoms, and lab results can then be tied to possible routes of exposure. For example, body temperatures are being monitored at airports, and those at risk of illness are prevented from travelling or held in quarantine. This data can be streamed in real time, and if a positive case is found, the flight manifest can quickly be used to determine where passengers have been and to trace possible transmission.
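The screening-and-manifest idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 38.0 °C cutoff, the flight codes, the passenger IDs and the helper names are all hypothetical, and a fever reading stands in here for a confirmed case.

```python
from dataclasses import dataclass

FEVER_THRESHOLD_C = 38.0  # assumed cutoff, for illustration only


@dataclass
class Screening:
    passenger_id: str
    flight: str
    temperature_c: float


def flag_fevers(readings, threshold=FEVER_THRESHOLD_C):
    """Yield screenings at or above the threshold as they stream in."""
    for reading in readings:
        if reading.temperature_c >= threshold:
            yield reading


def fellow_passengers(manifests, case):
    """Return everyone else listed on the flagged passenger's flight."""
    return sorted(p for p in manifests.get(case.flight, []) if p != case.passenger_id)


# Hypothetical sample data: flight manifests and a stream of airport readings.
manifests = {
    "XY101": ["p1", "p2", "p3"],
    "XY202": ["p4", "p5"],
}
stream = [
    Screening("p1", "XY101", 36.8),
    Screening("p2", "XY101", 38.6),
    Screening("p4", "XY202", 37.1),
]

# For each flagged passenger, list the people who shared the flight.
contacts_to_trace = {
    case.passenger_id: fellow_passengers(manifests, case)
    for case in flag_fevers(stream)
}
print(contacts_to_trace)
```

In a real deployment the readings would arrive from a streaming platform rather than a list, and the manifest lookup would be one of several joins (hotels, immigration records and so on).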

Analytics can present a better and more informed picture of the probabilities, which in itself is a good step forward. Says Steve Bennett, Director of the SAS Global Government Practice and former Director of National Biosurveillance at the US Department of Homeland Security, “It’s hard to create a specific forecast in time, but we can bring data together to tell us where there are conditions that increase the likelihood of something happening.”

Every such outbreak is an opportunity to learn more and faster about the mechanics of how such viruses spread. Steve says that by integrating data about known viruses, animal populations, human demographics and cultural/social practices around the world, AI can help predict hotspots where new diseases could emerge, helping public health officials and policymakers get a head start on preparedness and surveillance before events happen. He adds that scientists estimate that there could be as many as 800,000 currently unknown viruses in animals that could jump to humans. Using data to help predict where those jumps could occur can help us take preventive action and act more quickly to save lives when events do occur.

How does AI shift the paradigm?

The collection, analysis and reporting of traditional public health data have been used for years to track infectious disease and other public health threats. However, much of this work has been manual and time-consuming, with case counts buried inside a report or document. Such data has little analytic value unless it is available as a stream of active data.

Besides being an epidemiologist, Theresa focuses on leveraging data through analytics and AI. Explaining the role of AI, she says that it has changed the paradigm through technologies such as natural language processing (NLP), which uses algorithms that allow machines to identify keywords and phrases in natural language or unstructured text. NLP can also be used to pull data from non-traditional public health sources to provide further insights into public health threats, or early indications of them.

Additionally, AI can be used to model common themes or topics, helping us quickly identify common symptoms among new and evolving public health threats. Moreover, AI can help to automate data analysis, identify patterns and build models of risk factors to support scenario analysis of transmission. And when it comes to identifying paths of transmission, AI can possibly identify a host and/or index case, as well as possible contacts.

Steve points out that data, statistics and analytics have been intertwined in public health for many years. He cites the example of John Snow’s famous map of cholera cases clustering around water pumps in London in the 1850s. Says Steve, “As modern surveillance and data collection have matured since, statistics and analytics have been brought to bear on the early warning problem. What’s new about AI is that it unlocks a whole host of new insights that were not previously possible with classical statistics and analytics.”

The ability to synthesize many non-health data sets (e.g., travel data, school attendance) without needing to explicitly understand the relationships up-front is a hallmark benefit of AI and more specifically, machine learning. With adequate data size, we can simply throw the data into these technologies and ask them to tell us what the anomalies are.
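To make "ask them to tell us what the anomalies are" concrete, here is a deliberately simple stand-in: a trailing-window z-score over a single series. It is plain statistics rather than machine learning, and the absence counts, the window length and the threshold are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def find_anomalies(series, window=7, z_threshold=3.0):
    """Return (index, value, z) for points far above their trailing baseline."""
    anomalies = []
    for i in range(window, len(series)):
        baseline = series[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma == 0:
            continue  # flat baseline: z-score is undefined, so skip the point
        z = (series[i] - mu) / sigma
        if z > z_threshold:
            anomalies.append((i, series[i], round(z, 2)))
    return anomalies


# Hypothetical daily counts of school absences in one district.
absences = [20, 22, 19, 21, 20, 23, 21, 22, 20, 48, 21]
flagged = find_anomalies(absences)
print(flagged)
```

A machine learning approach would look at many such series together (travel, attendance, pharmacy sales) and learn what "normal" looks like across them, instead of relying on a hand-picked threshold for each one.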

Steve provides a real-life example of a problem that was just flat-out unsolvable without AI. Says Steve, “We can look at the detection of health events in noisy data sets like social media, specifically Twitter. When I was the Director of the National Biosurveillance Integration Center at the Department of Homeland Security, we implemented a pilot project to see if we could detect anomalous disease signals – specifically influenza-like illness – by mining Twitter data. We found that no matter how hard you try, there’s just no way to write enough business rules and keywords to effectively separate out the real health-related tweets from the noise. But applying machine learning, a form of AI, made it possible. We didn’t need to know the rules and keywords - we just needed to feed in enough training data to give the algorithms examples of what we knew was and was not an influenza-like-illness-related tweet. The algorithms then did the rest, creating a classifier that could detect real health-related tweets with decent accuracy. It’s not possible to get value out of these kinds of data sets without AI and machine learning.”
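The article does not say which algorithm the pilot used, so the sketch below swaps in a small multinomial naive Bayes classifier purely to show the idea of learning from labeled examples instead of hand-written rules. The six training messages and their labels are invented for illustration, and a real system would need vastly more data.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in self.class_counts}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, doc_count in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(doc_count / total_docs)  # class prior
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical labeled examples: "ili" = influenza-like illness, "other" = noise.
training = [
    ("I have a fever and chills, staying home sick", "ili"),
    ("coughing all night, my whole body aches", "ili"),
    ("down with the flu, sore throat and fever", "ili"),
    ("bieber fever at the concert tonight", "other"),
    ("this new album is sick, love it", "other"),
    ("caught the travel bug, booking flights", "other"),
]
texts, labels = zip(*training)
model = NaiveBayes().fit(texts, labels)

print(model.predict("fever and body aches, coughing since monday"))
print(model.predict("love the new album, booking concert tickets"))
```

Note that words like "fever" and "sick" appear in both classes. The classifier weighs them against the surrounding words, which is exactly what a fixed keyword list cannot do.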

The Path to Action

Nations and health organizations need to prepare themselves to leverage the new possibilities. There are two aspects to this: the first is putting mechanisms in place for gathering data, and the second is creating the layer of technology to support them.

To monitor public health, traditional data sources such as case reports, electronic health records and laboratory data are used for confirmation. But the data that can be leveraged for public health is constantly evolving. There are other sentinel data sources that can help identify public health threats early on. Furthermore, with globalization it is also important to tie case data to data on where a confirmed case has travelled (e.g., flight patterns, immigration to/from countries, hotel information, etc.) in order to quickly identify, triage and quarantine as necessary to stop transmission.

Theresa adds that genetic information is also important, as it can tell us whether the public health threat is evolving and how viruses may be linked. It provides insights into a possible host and how the threat passed from that host to humans.

Steve stresses the importance of having a solid data management ecosystem and platform where the data can be stored, cleaned, scaled and shared among key stakeholders. He says that speed is critical in responding to emerging infectious disease, and that having a verified “single version of the truth” in the data can accelerate the ability to make good, life-saving decisions. A data management platform that can handle data at rest (static health and disease reporting) as well as real-time streaming data (travel data and others) is an essential tool in the fight against these kinds of diseases. So it’s not just about the data, but also about how that data can be stored, shared and used effectively in global collaboration to fight the emergence and spread of disease.


Around The World