A full nine days before the World Health Organization recognized Ebola as an epidemic, there was an earlier signal. HealthMap, software that mines government websites, social networks and local news reports, had identified a “mystery hemorrhagic fever” that was spreading.
This raised the question: What potential do the vast amounts of data shared through social media hold for identifying outbreaks and controlling disease?
A San Diego State University professor recently authored a study that connects social media and mobile phone data with the prediction of potential outbreaks, specifically of pertussis and influenza.
Ming-Hsiang Tsou, the study’s author, believes that algorithms applied to tweets and to information stored in mobile phones can be used to predict and track outbreaks.
“Traditional methods of collecting patient data, reporting to health officials and compiling reports are costly and time consuming,” said Tsou. “In recent years, syndromic surveillance tools have expanded and researchers are able to exploit the vast amount of data available in real time on the Internet at minimal cost.”
Given the popularity of social media, infectious disease surveillance systems that use data-sharing technologies to accurately track social media data could potentially inform early warning systems and outbreak response, and facilitate communication between health-care providers and local, national and international health authorities.
Social media tracking: Then and now
Currently there are no official national programs for disease surveillance via social media, but several systems are being used as complementary sources of information.
For example, disease detection app Flu Near You helps predict outbreaks of the flu in real time. Users self-report symptoms in a weekly survey, which the app then analyzes and maps to show where pockets of influenza-like illness are located. Flu Near You is administered by HealthMap in partnership with the American Public Health Association and the Skoll Global Threats Fund. The effort is supported with private funds to demonstrate its utility for multiple sectors that work together on pandemic preparedness. The information on the site is available to public health officials, researchers, disaster planning organizations and anyone else who may find the information useful.
“There are real opportunities for using this data that is scattered across the Web in news, blogs, chat rooms and social media,” John Brownstein, HealthMap co-founder and associate professor of pediatrics at Harvard Medical School, told Emergency Management in a recent interview. “We’re focused on collecting all that information using data scraping, machine learning and other processes and combining it into one platform that will enable clinicians, public health practitioners and consumers to see what’s happening.”
Understanding the accuracy of such information is also important, said Tsou, whose recent study explored the interaction between cyberspace message activity (measured by keyword-specific tweets) and real-world occurrences of influenza and pertussis. Tweets were collected within a 17-mile radius of 11 U.S. cities chosen on the basis of population and the availability of disease data, aggregated by week and compared with the weekly incidence of influenza-like illness and pertussis. Correlation coefficients between tweets, or subgroups of tweets, and disease occurrence were then calculated, and the trends were presented graphically.
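For readers curious about the mechanics, the weekly aggregation and correlation steps described above can be sketched in a few lines of Python. This is only an illustration, not the study's code: the function names are hypothetical, weeks are assumed to be ISO calendar weeks, and the coefficient is assumed to be Pearson's, which the description above does not specify.

```python
from collections import Counter
from math import sqrt


def weekly_counts(tweet_dates):
    """Count tweets per (ISO year, ISO week). Assumes datetime.date inputs."""
    counts = Counter()
    for d in tweet_dates:
        iso_year, iso_week, _ = d.isocalendar()
        counts[(iso_year, iso_week)] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def tweet_case_correlation(tweet_dates, weekly_cases):
    """Correlate weekly tweet counts with weekly reported cases.

    weekly_cases maps (iso_year, iso_week) to a case count (hypothetical
    input format). Weeks with no matching tweets are counted as zero.
    """
    counts = weekly_counts(tweet_dates)
    weeks = sorted(weekly_cases)
    tweets = [counts.get(week, 0) for week in weeks]
    cases = [weekly_cases[week] for week in weeks]
    return pearson(tweets, cases)
```

A caller would pass the dates of keyword-matching tweets already filtered to one city's radius, along with that city's weekly case counts. A result near 1 means tweet volume rose and fell in step with reported illness.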
“The correlation between the weekly flu tweets versus the national flu data was almost 86 percent,” said Tsou. “It was a very high correlation. Even more interesting is that when we compared our data to data from the San Diego County Health and Human Services Agency, who we partner with, we received even more precise data on weekly flu cases reported through their lab testing. The correlation was 93 percent — even higher than the national level. That was a very encouraging finding.”
But using social media data this way also presents challenges, such as linking a social media post to a specific disease or condition.
“A lot of people tweet that they have a fever or have the flu, but sometimes that information isn’t specific enough for us to connect it with a disease like whooping cough,” Tsou said. “That’s one of the limitations we are dealing with.”
“There’s both a blessing and a curse to using social media in that it’s super rapid, but it also generates huge amounts of noise,” Brownstein said. “Dealing with all the noise and trying to pick out the signals that have meaning is definitely a challenge.”
A world of possibilities for public health
Some public health agencies are already beginning to rely on social media data to investigate health issues.
For example, last year the Chicago Department of Public Health began using Twitter to identify foodborne illness outbreaks. The department teamed up with a group called Smart Chicago to develop an app that analyzes tweets referencing food poisoning, leading the city to step up inspections of, and enforcement against, offending establishments.
The New York City Department of Health and Mental Hygiene is taking a similar approach. It recently worked with Columbia University and Yelp on a pilot to prospectively identify restaurant reviews on Yelp that referred to foodborne illness.
“These systems are operational, and they are being used by government entities to provide situational awareness,” Brownstein said. “They’re not necessarily the only sources of information, but they are an important source of information.”
But it may still be a while before public health departments officially adopt social media data as a significant element of their regular investigations.