Sara Harris is a federal contractor working to support the Division of Fusion within the Office of the Assistant Secretary for Preparedness and Response in the Department of Health and Human Services. She has been with Fusion for over a year and her work focuses on outreach, communication and social media. Sara received her Bachelor’s in Geographic Science from James Madison University and Master’s in Global Environmental Policy from American University. Prior to her time with Fusion, Sara worked on climate change education and outreach with multiple national environmental non-profits.
Background: Twitter has emerged as a critical source of free and openly available information during emergency response operations, providing an unmatched level of on-the-ground situational awareness in real-time. Responders and survivors turn to Twitter to share information and resources within communities, conduct rumor control, and provide a “boots on the ground” understanding of the disaster. However, the ability to tune out background “noise” is essential to effectively utilizing Twitter to identify important and useful information during an emergency response.
Methods: This article highlights a two-prong strategy in which the use of a Twitter list paired with subject-specific Boolean searches provided increased situational awareness and early event detection during the United States Department of Health and Human Services (HHS), Office of the Assistant Secretary for Preparedness and Response (ASPR) response to Superstorm Sandy in 2012. To maximize the amount of relevant information that was retrieved, the Twitter list and Boolean searches were dynamic and responsive to real-time developments, evolving health threats, and the informational needs of decision-makers.
Conclusion: The use of a Twitter list combined with Boolean searches led to enhanced situational awareness throughout the HHS response. Incorporating a dynamic search strategy over the course of the HHS Sandy response made it possible to account for over-tweeted information, changes in event-related conversation, and decreases in the return of relevant information.
Twitter has emerged as a critical source of information in emergency response operations. Following and reviewing publicly available information posted on Twitter in real-time provides an unmatched source of on-the-ground situational awareness. Situational awareness is defined as “all knowledge that is accessible and can be integrated into a coherent picture, when required, to assess and cope with a situation.”
The micro-blogging site Twitter, created in 2006, allows users to share information, ideas, and opinions in 140 characters or fewer. Over the past 8 years, Twitter has grown to more than 241 million monthly active users who send 500 million tweets each day.*
The effectiveness of Twitter as a tool to provide situational awareness to emergency responders was demonstrated in multiple large scale disasters including typhoons,
To complement official data sources and reporting, the US Department of Health and Human Services (HHS) Office of the Assistant Secretary for Preparedness and Response (ASPR) started monitoring Twitter for situational awareness in 2009 during the H1N1 influenza pandemic. Since then, ASPR has continued to refine its monitoring techniques and has increased the number of analytic tools used to bring further depth and detail to maintaining situational awareness.
Presently, Twitter is used for supplementing situational awareness in two distinct ways. The first way is by following long-term trends to identify spikes in Twitter conversation that might indicate an emerging public health concern. This form of monitoring is primarily used for long term epidemiological surveillance of diseases such as Middle East Respiratory Syndrome Coronavirus (MERS-CoV) or H7N9 influenza, and requires the use of a Twitter analytics tool that can access historical Twitter data. The second way ASPR monitors Twitter, which is the focus of this article, is event-specific real-time monitoring. In contrast to the long term trend analysis which focuses on identifying deviations from the historical baseline, event-specific real-time monitoring focuses on finding reports on Twitter that provide situational awareness during an emergency response. The search techniques employed by ASPR for monitoring Twitter are meant to filter down to the most relevant information shared in the midst of a disaster. This article will outline the strategy ASPR employed for event-specific real-time Twitter monitoring during the HHS response to Superstorm Sandy in 2012.
Sandy made landfall in the US on the coast of Brigantine, New Jersey as a post-tropical cyclone on October 29, 2012.
Under the National Disaster Response Framework,
As the storm ended and survivors began to assess the damage from the storm, Twitter was inundated with information. Between October 27, 2012 and November 1, 2012 Twitter users sent more than 20 million tweets using the terms
ASPR began monitoring Twitter for indications and warnings of potential public health emergencies as Sandy made landfall in New Jersey on October 29, 2012. For the initial two weeks of the HHS response, Twitter was monitored 16 hours a day every day. The monitoring team consisted of one full-time social media lead analyst (SHS) and ad-hoc ASPR and HHS employees and interns. When possible, monitoring was split into eight hour shifts to make the work load more manageable. Throughout a shift a single analyst would be entirely dedicated to monitoring Twitter searches and lists
During the first two weeks of the HHS response, in addition to monitoring for early reports of public health emergencies, an overview of the trends observed on Twitter was provided for overall situational awareness during twice daily HHS briefings. On November 17, 2012, monitoring was scaled back to regular working hours and daily situational awareness reports were reduced to once a day. ASPR stopped daily monitoring of Twitter for public health situational awareness related to Superstorm Sandy on December 13, 2012.
Advance notice events (e.g., hurricanes) provide responders with crucial time to better prepare for the implementation of a Twitter monitoring strategy. As it became clear that Sandy would be making landfall somewhere along the eastern seaboard of the US, ASPR analysts began implementing a two-prong monitoring strategy. The strategy included the use of: 1) Twitter lists; a curated list of Twitter users usually tweeting about a single topic
These two methods were employed in tandem to retrieve information that addressed an HHS pre-determined list of hurricane Essential Elements of Information (EEIs). The HHS EEIs guide in determining: what information is critical, who is responsible for the data collection, and the frequency of reporting for HHS emergency response operations.
Table 1 includes only the EEI categories that ASPR monitored on Twitter for potential relevant information that would improve HHS situational awareness. The full list of EEI categories is more extensive. Throughout the emergency response, the informational needs and HHS response concerns changed on an almost daily basis as the situation on the ground shifted. Flexibility and adaptability were critical to effectively monitor Twitter for evolving information needs, trends, and situational awareness. Therefore, the monitoring strategy used was driven by two primary factors: changes in key word terminology associated with the event on social media and the informational needs of leadership and field teams. The EEI categories listed in Table 1 were selected for monitoring on Twitter as they were top priorities that had been reported on social media channels during previous incidents.
• Status of critical infrastructure (i.e., hospitals, nursing homes, mental health clinics)
• Status of shelters
• Property damage in affected area and casualties (including fatalities)
• Status of medical special needs populations
• Injury/disease surveillance and outbreaks
• Mandatory evacuations and relocation assistance
• Medical assistance required with Urban Search and Rescue Teams
• Environmental conditions (including contamination)
• Status of vulnerable populations
HHS and ASPR do not use or rely solely on Twitter for information on any of the EEI categories, as social media (including Twitter) is not always an appropriate source for certain types of information such as the status of deployed personnel or supply needs within medical facilities.
The Twitter list and Boolean searches functioned differently (Figure 1) to filter incoming data from Twitter in order to make it more usable and relevant to the concerns of the ongoing HHS response. Used together, the two provided a more complete and robust picture from Twitter. The Twitter list acted as a wide net with specifically sized holes, monitoring and capturing all EEIs of concern that were openly reported by local news, public officials, and local emergency management who were already reporting on Superstorm Sandy. An added benefit of the Twitter list is that it also captured unknown or unconsidered hazards. In contrast, the Boolean searches used a variety of keywords and limiting or broadening Boolean operators to focus on and extract tweets that included more specific EEIs of concern (e.g., hospital evacuation) from the entire sea of Twitter. Used alone, a Twitter list would result in a search that was too broad, while using only Boolean searches would lead to results that were too narrowly focused.
As Bennett and colleagues
When the Twitter list was built for an advance notice event (e.g., a hurricane), each account was reviewed and verified based on the user’s Twitter profile and Twitter use. Specifically, analysts looked at when the user’s profile was created, how often the user tweeted, who was following the user, and what kind of information the user had historically tweeted. This information was used to verify that real and reliable accounts were added and used. Taking time beforehand to review and verify potential accounts for a Twitter list identified more trustworthy sources of information ahead of time, allowing analysts to have greater confidence in the information those sources provided during the response.
• Governor
• State Emergency Management or Homeland Security Agency
• Local/City Emergency Management Agency
• State/Local/City Department of Health
• Local Mayor’s Office
• Local Police Department
• Local News Stations/Local Newspapers
• Local Journalists
Generally, a pre-made Twitter list for an emergency response would focus on one to two major geographic areas impacted by the emergency. However, with Superstorm Sandy, it remained unclear what areas would be hardest hit until just hours before landfall; models varied, with some citing Washington, D.C. and the Eastern Seashore as the area of greatest concern while others were reporting a direct hit to New York City. Due to this uncertainty leading up to landfall, the Twitter list initially developed covered a large area of the East Coast ranging from Virginia to Rhode Island.
However, within the first 24 hours of landfall, major updates were made to the Twitter list as the storm path became clearer. Twitter accounts from states north of Connecticut were culled from the list, and the number of accounts from states south of New Jersey was thinned, but not eliminated entirely. For example, the National Capital Region (Maryland, Virginia, Washington, D.C.) remained a focal point because the HHS headquarters are located within this area.
The focus of the revised and ever evolving Twitter list became New York, New Jersey, and Connecticut. In the first 24 to 48 hours post-landfall, the number of accounts included on the Twitter list from each of these states grew rapidly. Very few of these additions were the result of actively searching for new accounts. Instead, new accounts were added to the Twitter list during active monitoring, usually through indirect recommendations (e.g., retweeting) of existing list members. For example, if a local news station already included on the list retweeted one of its reporters, who was not on the list and who was live-tweeting an unfolding situation, that reporter would then be added to the Twitter list. Throughout the course of Superstorm Sandy, the locale-specific Twitter list not only provided increased situational awareness, but also resulted in a deluge of new sources of information.
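The retweet-driven growth of the list described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the process ASPR used: analysts added accounts by hand during active monitoring, and the tweet record layout (`author`, `retweet_of`) and the function name `retweet_candidates` are hypothetical. The sketch merely surfaces accounts that existing list members have retweeted, so that an analyst could review them before adding them.

```python
from collections import Counter

def retweet_candidates(tweets, list_members):
    """Return (account, count) pairs for accounts that are not on the list
    but were retweeted by accounts that are, most-retweeted first.

    `tweets` is an iterable of dicts with the hypothetical keys
    "author" (who posted) and "retweet_of" (original author, or None).
    """
    members = {m.lower() for m in list_members}
    counts = Counter()
    for tweet in tweets:
        author = tweet["author"].lower()
        original = tweet.get("retweet_of")
        # Only retweets made by current list members count as recommendations.
        if author in members and original and original.lower() not in members:
            counts[original.lower()] += 1
    return counts.most_common()

# Invented example data, for illustration only.
example_tweets = [
    {"author": "LocalNews", "retweet_of": "reporter_a"},
    {"author": "LocalNews", "retweet_of": "CityOEM"},     # already on the list
    {"author": "random_user", "retweet_of": "reporter_b"},  # not a list member
    {"author": "CityOEM", "retweet_of": "reporter_a"},
    {"author": "CityOEM", "retweet_of": None},            # original tweet
]
candidates = retweet_candidates(example_tweets, ["LocalNews", "CityOEM"])
```

Any candidate surfaced this way would still need the account verification described elsewhere in this article before being added to the list.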
As the HHS response grew longer, the number of accounts reporting Sandy-related information decreased to those reporting on, and often from, the most impacted areas. This reinforced what Palen and colleagues learned from the Red River floods, in that interest and attention to a disaster is sustained by those who are local and most impacted by the event.
The initial Boolean Twitter searches created for the HHS Superstorm Sandy response focused on six general areas of concern from the EEIs: hospitals, nursing homes, shelters, injuries/fatalities, cold-related illnesses, and carbon monoxide poisoning. At the beginning, these early subject specific searches were kept simple by using broad hurricane terminology
(Sandy OR hurricane) AND hospital
(Sandy OR hurricane) AND "nursing home"
(Sandy OR hurricane) AND shelter
(Sandy OR hurricane) AND (sick OR injured OR death OR dead OR fatality OR killed OR died)
(Sandy OR storm OR hurricane) AND (snow OR cold)
(CO OR "Carbon monoxide") AND (poisoning OR generator) AND (Sandy OR Hurricane)
Abbreviation: CO = Carbon Monoxide
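Each of the initial searches is a conjunction (AND) of groups of alternative terms (OR). As an illustration of that structure, the following Python sketch applies the six searches to tweet text. It is a minimal sketch, not the social media dashboard ASPR used: the search names, the functions `contains_term`, `matches`, and `matching_searches`, and the choice of case-insensitive whole-word matching are all assumptions, and the dashboard's actual matching rules are not described in this article.

```python
import re

# Each search is a list of groups; a tweet must match at least one term
# from every group (AND across groups, OR within a group).
SEARCHES = {
    "hospital": [["sandy", "hurricane"], ["hospital"]],
    "nursing_home": [["sandy", "hurricane"], ["nursing home"]],
    "shelter": [["sandy", "hurricane"], ["shelter"]],
    "injuries_fatalities": [
        ["sandy", "hurricane"],
        ["sick", "injured", "death", "dead", "fatality", "killed", "died"],
    ],
    "cold": [["sandy", "storm", "hurricane"], ["snow", "cold"]],
    "carbon_monoxide": [
        ["co", "carbon monoxide"],
        ["poisoning", "generator"],
        ["sandy", "hurricane"],
    ],
}

def contains_term(text, term):
    """Case-insensitive whole-word (or whole-phrase) match.

    Assumption: "#Sandy" counts as a match for "sandy", while "Colorado"
    does not count as a match for "co".
    """
    pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

def matches(text, groups):
    """True if the text satisfies every OR-group of a search."""
    return all(any(contains_term(text, t) for t in group) for group in groups)

def matching_searches(text):
    """Names of all searches whose conditions the text satisfies."""
    return [name for name, groups in SEARCHES.items() if matches(text, groups)]
```

For example, with an invented tweet such as "Bellevue hospital on backup power after #Sandy", `matching_searches` returns only the hospital search. Representing searches as data rather than as fixed strings also makes it straightforward to record each edit made to a search over the course of a response.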
During the course of the response, the Boolean searches were edited to account for over-tweeted information or news. As an example, following multiple hospital evacuations, there was an abundance of stories about nurses evacuating babies in the midst of power outages. As a result, these stories overwhelmed the pre-established hospital search stream. Therefore, to better control the information retrieved, the hospital search was adjusted to exclude the terms most commonly used in the tweets referring to this incident (e.g., “hero” and “babies”). By excluding these terms, the level of “noise” and volume of tweets retrieved via this search was decreased to a more manageable volume. Figure 2 outlines the steps made to adjust the hospital search. These adjustments were made to account for trends within the Twitter conversation.
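The adjustment described above amounts to adding excluded terms to an existing search. The Python sketch below illustrates the idea under stated assumptions: the function names are hypothetical, the example tweets are invented, whole-word matching is assumed, and the exact sequence of steps shown in Figure 2 is not reproduced here. Only the two excluded terms named in the text ("hero" and "babies") are used.

```python
import re

def has_term(text, term):
    """Case-insensitive whole-word match (an assumed matching rule)."""
    pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

def hospital_search(text, excluded=()):
    """(Sandy OR hurricane) AND hospital, minus any excluded terms."""
    required = has_term(text, "hospital") and (
        has_term(text, "sandy") or has_term(text, "hurricane")
    )
    return required and not any(has_term(text, term) for term in excluded)

# Invented example tweets, for illustration only.
over_tweeted = "Nurses carried babies down nine flights as hospital lost power in Sandy"
operational = "Hospital in lower Manhattan running on generator power #Sandy"

# Before the adjustment, both tweets are retrieved.
before = [hospital_search(t) for t in (over_tweeted, operational)]

# After excluding the over-tweeted terms, only the operational tweet remains.
after = [hospital_search(t, excluded=("hero", "babies"))
         for t in (over_tweeted, operational)]
```

A trade-off of this approach is that an excluded term also removes any new, unrelated tweet that happens to contain it, so exclusions are worth revisiting once the over-tweeted story subsides.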
As the storm weakened and moved north into Canada, further edits and additions were made to the Boolean searches to account for changes in the overall Twitter conversation. Analysts found that as more time passed from the date of landfall, Twitter users referred to the storm less by name. People who had regularly been providing updates via Twitter became less likely to use #Sandy and more likely to simply state a problem or concern with the understanding that they were already talking about Sandy.
Figure 3 depicts the results of a retrospective analysis using the Topsy Pro Public Sector Analytics™ (Topsy) of the use of the words
As Twitter is not easily searchable based upon geography and only an estimated 2% of all tweets on a typical day include geographic metadata,
Throughout the HHS response the Twitter list and all Boolean searches were monitored via a social media dashboard to allow for ease of viewing and streaming of information. This dashboard allowed for real-time streaming of filtered data and was essential to organizing searches and incoming information into manageable streams. Each analyst utilized his/her own individual dashboard with identical searches and a subscription to the Twitter list. Edits and changes to the list and searches were managed by the lead analyst (SHS) and communicated and tracked on a single document.
The monitoring of Twitter using the two-prong strategy outlined above involved significant flexibility and adaptation. When Twitter monitoring was ended for the Sandy response there were a total of 20 Boolean searches followed regularly with additional ad-hoc searches created as needed for further information. The use of this prong of the strategy resulted in the retrieval and flagging of multiple EEI events during the course of the Sandy response. Throughout the HHS response, more than 30 Twitter reports or responses to information requests in addition to daily summaries were sent.
One example that demonstrated the value of developing and using a dynamic Twitter monitoring strategy to enhance situational awareness during Superstorm Sandy was the evacuation of Bellevue Hospital in New York City.
The use of Boolean searches to track the status of healthcare infrastructure was valuable during the HHS response. The evacuation of a major hospital places additional stress on state and local health systems; therefore, early awareness of an actual or potential evacuation could help HHS prepare should states request federal assistance. Boolean searches worked well for this EEI category as it was a narrow subject matter with defined concerns. Around 8:45 p.m. EST on October 29, 2012, the standard hospital Boolean search retrieved tweets reporting that Bellevue Hospital was operating on emergency generators (Figure 5). Using the information provided in the initial tweets, new and more specific Boolean searches were created using “Bellevue” as a keyword to attempt to retrieve further details and enhance situational awareness.
After the creation of more targeted Boolean searches, it was determined that Bellevue Hospital’s emergency generators were not working properly and that the basement was flooding. Bellevue Hospital hovered in a critical situation before fully evacuating its final 500 patients nearly two days later. This example demonstrated that while it was not feasible to monitor every facility or potential incident of concern ahead of time (particularly in high population density locations), using broad searches to identify defined topics of concern, with further investigation as necessary, has the potential to be a more efficient use of time and energy during a response.
Some EEIs were more easily tracked via the Twitter list due to the broad nature of the subject matter. Disease surveillance and outbreaks is a broad subject that could encompass a variety of different illnesses and concerns. It would have been too difficult to attempt to predict and create Boolean searches for every possible illness that could emerge as a consequence of the storm and subsequent recovery efforts. Instead, the Twitter list was used as a wide net to monitor overall community health in the days, weeks, and months following the storm.
For example, on November 6, 2012, a local New York newspaper tweeted a report of an outbreak of norovirus at an evacuee shelter. This was the first report ASPR retrieved from Twitter regarding gastrointestinal illness outbreaks at evacuee shelters. Because this first mention of a disease outbreak was captured via a pre-identified source on Twitter and disseminated to HHS personnel, HHS field staff increased precautions and were better prepared to mitigate similar outbreaks in facilities where NDMS representatives and USPHS officers were located.
There were several limitations associated with this method of using Twitter for situational awareness. The ability to verify and fully trust information obtained from Twitter and other forms of social media was a concern. Fake Twitter accounts and rumors can quickly inundate Twitter following a disaster. For example, following the Boston bombings in 2013, 29% of the most tweeted content was the result of rumors and fake accounts.
It was critical to have a verification system in place prior to the activation of monitoring in order to avoid perpetuating the spread of rumors and inaccurate information.
However, during the early hours of the HHS response, the verification process used to add accounts to the Twitter list was modified due to significant constraints on time and resources. The modified process focused on the Twitter account profile information such as the profile biography, creation date, number of followers, and number of tweets sent. Unlike during the preparation phase, a review of past tweets sent by the user was not performed.
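The difference between the full and the modified verification process can be expressed as a simple checklist. The Python sketch below is an illustration only: the article does not report any numeric cut-offs, so the thresholds (`min_age_days`, `min_followers`, `min_tweets`) are hypothetical placeholders, as are the `AccountProfile` fields and the function name `verification_gaps`. In practice these checks were made by analyst judgment rather than by fixed rules.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class AccountProfile:
    handle: str
    bio: str
    created: date
    followers: int
    tweets_sent: int
    past_tweets_reviewed: bool = False  # set by an analyst after manual review

def verification_gaps(profile, as_of, full_review,
                      min_age_days=90, min_followers=100, min_tweets=50):
    """List the reasons an account does not yet pass verification.

    The numeric thresholds are hypothetical placeholders, not values
    taken from the HHS response. With full_review=False, the review of
    past tweets is skipped, mirroring the modified process.
    """
    gaps = []
    if not profile.bio.strip():
        gaps.append("empty profile biography")
    if (as_of - profile.created).days < min_age_days:
        gaps.append("account created recently")
    if profile.followers < min_followers:
        gaps.append("few followers")
    if profile.tweets_sent < min_tweets:
        gaps.append("few tweets sent")
    if full_review and not profile.past_tweets_reviewed:
        gaps.append("past tweets not reviewed")
    return gaps

# Invented example account, for illustration only.
established = AccountProfile("example_oem", "County emergency management",
                             date(2010, 5, 1), 5000, 1200)
landfall = date(2012, 10, 29)
modified_result = verification_gaps(established, landfall, full_review=False)
full_result = verification_gaps(established, landfall, full_review=True)
```

Recording which process was applied to each account would allow accounts added under the modified process to be revisited for a full review once time permits.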
The level of situational awareness gained from the EEI Boolean searches was only as good as the keywords or phrases they contained. Building good searches required an understanding of the vocabulary associated with the ongoing response and EEIs of concern. It also required a familiarity with the vernacular of the impacted population and geographic area. Analysts had to be willing to immerse themselves in the conversation to gain a better understanding of the needs of the impacted population.
Nonetheless, overdependence on Boolean searches could lead to the exclusion of certain tweets that may contain important information not included in the search terminology. It is important to acknowledge that every event is unique, and that there inevitably will be a certain number of unknown factors that do not fall into the predetermined EEI categories. Broad searches allow room for anomalies or outliers. The search,
As stated earlier, hashtags were not used in the initial set of Boolean searches as the social media dashboard used included hashtags in the search results. However, a short and simple search was created to track the traffic of a few major hashtags (i.e.,
This two-prong monitoring strategy required a staff completely dedicated to monitoring and managing the Twitter list and Boolean searches. Staff limitations were a factor for Twitter monitoring during Superstorm Sandy due to the number of team members deployed as part of the HHS response. In the U.S., emergency managers report shortages of personnel as the number one reason why their local agencies do not incorporate social media in their emergency management activities.
The use of Twitter varies demographically, geographically, and even politically, so its usefulness differs from one incident or event to the next. Because every disaster is different, the same search techniques and strategies may not be applicable and relevant for every response. However, three practices are crucial to making Twitter work for augmenting situational awareness: starting from pre-established Boolean search strategies based on existing information needs and priorities, having a process for verifying Twitter accounts immediately before and during the response, and documenting changes made along the way.
The most important lesson learned from Superstorm Sandy was the need for a dynamic and flexible monitoring process and strategy to understand and respond quickly to health needs in the impacted areas. Search strategies should change as frequently as the unfolding event; failing to adapt to a changing situation leads to stale terminology and stagnant search results. Twitter lists and Boolean searches should be used together to maximize situational awareness. The most important information comes from the impacted population, whether local news, local government, or local citizens. These are the people who care the most about the information being disseminated. The low number of geo-located tweets and the inability to filter tweets based on location in many tools can make it difficult to locate tweets from directly impacted areas. A targeted Twitter list can help to overcome this hurdle while simultaneously providing more trustworthy content. Information from the list can often be used to guide what additional Boolean searches should be created to fish for individual tweets that provide additional details.
In addition, each disaster presents its own unique hazards and challenges. An understanding of the impacted populations and communities will allow for the development of a more thoughtful and complete Twitter monitoring strategy. This includes taking into account the demographics of the impacted area, whether the area is urban or rural, and how active local government is on Twitter. Each of these will have a major impact on the amount and type of information that is available via Twitter throughout the course of the response.
The use of a two-pronged approach to monitoring Twitter during Superstorm Sandy proved beneficial to the HHS response and augmented overall situational awareness of the affected areas. The strategy used accounted for known (i.e., EEI Boolean searches) and unknown informational needs (i.e., Twitter list). Yet, it is not enough to create a Twitter list and some Boolean searches at the beginning of an event and expect a consistent return of useful information. Twitter monitoring during an event involves constantly reacting to trends in the conversation and changing informational needs, and must remain fluid and flexible to prevent the retrieval of irrelevant information resulting from stagnating searches and lists. Investing in the technology, personnel, and training needed to effectively and efficiently monitor Twitter during a response has the potential to result in a high return on investment and inform on the needs of impacted communities.
*At time of writing. The numbers of monthly active users and tweets sent each day change over time.
The authors have declared that no competing interests exist.