Professor Burkle, MD, MPH, DTM, PhD(Hon.), is a Senior Fellow with the Harvard Humanitarian Initiative, Harvard University, and Visiting Scientist at the Harvard School of Public Health. He is a Senior International Public Policy Scholar at the Woodrow Wilson International Center for Scholars in Washington, DC, and Senior Associate Faculty, Department of International Health and the Center for Refugee & Disaster Response, Johns Hopkins University Medical Institutes. He holds Adjunct Professorships at Monash University School of Medicine and James Cook University in Australia and is Adjunct Professor of Surgery, Division of Military and Emergency Medicine, the Uniformed Services University of the Health Sciences in Bethesda, MD. He is the recipient of the prestigious William Crawford Gorgas Medal for “distinguished work in preventive medicine, groundbreaking work in disaster management and humanitarian assistance and the training of an entire generation of U.S. and international personnel.” In 2007 he was elected to the Institute of Medicine of the National Academies of Sciences. In 2012 he was selected by the National Institutes of Health to give the Joseph Leiter Award lecture recognizing the “best and brightest” in science and medicine. Dr. Burkle has published over 200 scientific articles, 53 book chapters, and four books, three of them on disaster management, including Disaster Medicine (1984). He has worked in and consulted on numerous humanitarian emergencies and large-scale international disasters in Asia, Africa, the Middle East, and Eastern Europe. From 2002 to 2003, Dr. Burkle served as Deputy Assistant Administrator for the Bureau of Global Health at the U.S. Agency for International Development and as the Interim Minister of Health in Iraq.
A 1961 Saint Michael’s College and 1965 University of Vermont College of Medicine graduate, Dr. Burkle holds post-graduate degrees from Yale, Harvard, Dartmouth, the University of California at Berkeley, University of Geneva, and the Royal College of Surgeons in Ireland. He is qualified in Emergency Medicine, Pediatrics, Pediatric Emergency Medicine, and Psychiatry, and holds a Master's Degree in Public Health and a Diploma in Tropical Medicine. He is a combat-decorated, now-retired Naval Reserve Captain who served with the Marines in Viet Nam, Somalia, the Persian Gulf War, and Iraq. He is a current member of the Board of Directors and Overseer of the International Rescue Committee, the world’s largest refugee organization, and of the Science Advisory Board of the American Red Cross, and he served as Chair of the National Disaster Life Support Consortium of the American Medical Association for 4 years.
Background: This article describes a novel triangulation methodology for identifying the Twitter activity of regionally active Twitter users during the 2013 Hattiesburg EF-4 Tornado.
Methodology: A data extraction and geographically centered filtration approach was used to generate Twitter data for the 48 hours before and after the tornado. The data were further validated with a Six Sigma approach using GPS data.
Results: The regional analysis revealed a total of 81,441 tweets, 10,646 Twitter users, 27,309 retweets, and 2,637 tweets with GPS coordinates.
Conclusions: Tweet activity increased 5-fold during the response to the Hattiesburg Tornado. Retweeting activity increased 2.2-fold. Tweets with a hashtag increased 1.4-fold. Twitter was an effective disaster risk reduction tool for the 2013 Hattiesburg EF-4 Tornado.
This is Part II of a four-part series of articles analyzing the effectiveness of Twitter as a disaster risk reduction tool in mitigating morbidity and mortality.
Tweets were extracted and user region was determined through triangulation followed by GPS and quality assessment validation.
Tweets were extracted from the Twitter database through GNIP, an authorized Twitter data reseller, using filters and PowerTrack rules defined by an 11-day window, February 5, 2013, 5:00 pm to February 15, 2013, 5:00 pm, centered on the date and time of the Hattiesburg Tornado, February 10, 2013, 5:00 pm.
Extraction required that a tweet's metadata or attached profile contain at least one of a variety of previously used disaster hashtags, keywords, Mississippi city names, the term Mississippi, or a geolocation within the state of Mississippi. Data were returned from GNIP in JSON format and converted into a tab-delimited file; secondary filtration was done with a Perl script in a Unix environment on a Lenovo V570 laptop.
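The JSON-to-tab-delimited conversion described above can be sketched as follows. This is a minimal Python illustration, not the Perl script used in the study. The field names (`id`, `postedTime`, `actor`, `preferredUsername`, `summary`, `location.displayName`, `body`) and the column layout are assumptions chosen for the example, and the sample record is invented.

```python
import csv
import io
import json

# Hypothetical column layout: (column name, path of keys into the JSON record).
COLUMNS = [
    ("tweet_id", ("id",)),
    ("posted_time", ("postedTime",)),
    ("username", ("actor", "preferredUsername")),
    ("biography", ("actor", "summary")),
    ("location", ("actor", "location", "displayName")),
    ("text", ("body",)),
]


def dig(record, path):
    """Follow a path of keys into nested dicts; return '' if any key is missing."""
    value = record
    for key in path:
        if not isinstance(value, dict) or key not in value:
            return ""
        value = value[key]
    return "" if value is None else str(value)


def flatten(value):
    """Collapse tabs and newlines to single spaces so a record stays on one line."""
    return " ".join(value.split())


def json_lines_to_tsv(lines, out):
    """Write one tab-delimited row per JSON record (one record per input line)."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow([name for name, _ in COLUMNS])
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        writer.writerow([flatten(dig(record, path)) for _, path in COLUMNS])


# Invented sample record, for illustration only.
sample = [
    json.dumps({
        "id": "1",
        "postedTime": "2013-02-10T23:05:00.000Z",
        "actor": {
            "preferredUsername": "example_user",
            "summary": "USM student\nSMTTT",
            "location": {"displayName": "Hattiesburg, MS"},
        },
        "body": "Tornado sirens going off #mswx",
    }),
]
buffer = io.StringIO()
json_lines_to_tsv(sample, buffer)
tsv_text = buffer.getvalue()
```

Flattening whitespace before writing keeps multi-line biographies from breaking the one-record-per-line layout that a downstream line-oriented filter would rely on.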
Table 1. Regional user tweet data was analyzed 48 hours before and 48 hours after the storm, with a 2-hour buffer to capture tweets just prior to impact.

| Measure | Pre-Storm | Post-Storm |
|---|---|---|
| Total: | | |
| Users | 3,145 | 7,501 |
| Tweets | 27,927 | 53,514 |
| Retweets | 6,551 | 20,758 |
| Tweets with a hashtag | 5,763 | 13,598 |
| GPS tweets | 758 | 1,879 |
| Application types | 141 | 192 |
| Verified users | 3 | 3 |
| Languages | 6 | 8 |
| Average: | | |
| User account length (days) | 785 | 846 |
| Followers | 745 | 779 |
| Klout | 33 | 34 |
| Friends | 506 | 504 |
Case-insensitive regional filters were developed around the geographical locations, news sources (radio, news, television), and colleges/universities in the area. The filters were then applied to the extracted data based on the users' tweets, biographies, and locations.
Table 2. User profile categories and the terms that were used to determine the region of a user. Inclusive users were captured by more than one term, while exclusive users were captured by a single term.

| Category | Terms | Inclusive | Exclusive |
|---|---|---|---|
| Tweet | User Mentioned Regional News | 1,938 | 1,444 |
| Biography | Mississippi, Alabama, Hattiesburg, Birmingham | 845 | 186 |
| Biography | USM, Ole Miss | 417 | 126 |
| Biography | SMTTT | 327 | 74 |
| Location | Mississippi, Alabama, Hattiesburg, Birmingham | 4,077 | 1,826 |
| Location | MS, AL | 4,175 | 1,882 |
| Location | ZIP (for all MS & AL) | 42 | 44 |
| Location | USM, Ole Miss | 22 | 18 |
| Location | SMTTT | 1 | 0 |
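The regional filtering and the inclusive/exclusive tally can be sketched in Python as below. This is an illustration under stated assumptions, not the study's Perl code: the news handles `example_news_ms` and `example_tv_al` are hypothetical placeholders (the actual outlet list is not given), the ZIP code filter is omitted, and the three sample users are invented.

```python
import re
from collections import Counter


def term_pattern(*terms):
    """Compile a case-insensitive, whole-word pattern matching any of the terms."""
    body = "|".join(re.escape(term) for term in terms)
    return re.compile(r"\b(?:" + body + r")\b", re.IGNORECASE)


PLACES = ("Mississippi", "Alabama", "Hattiesburg", "Birmingham")

# (profile field, label, pattern). The news handles are hypothetical placeholders.
FILTERS = [
    ("tweet", "regional news mention", term_pattern("example_news_ms", "example_tv_al")),
    ("biography", "place name", term_pattern(*PLACES)),
    ("biography", "college", term_pattern("USM", "Ole Miss")),
    ("biography", "SMTTT", term_pattern("SMTTT")),
    ("location", "place name", term_pattern(*PLACES)),
    ("location", "state abbreviation", term_pattern("MS", "AL")),
]


def matched_filters(user):
    """Return the (field, label) pairs whose pattern matches the user's field."""
    return [
        (field, label)
        for field, label, pattern in FILTERS
        if pattern.search(user.get(field, ""))
    ]


def tally(users):
    """Split regional users into exclusive (one filter hit) and inclusive (several)."""
    inclusive, exclusive = Counter(), Counter()
    regional = set()
    for name, user in users.items():
        hits = matched_filters(user)
        if not hits:
            continue  # no regional term matched: treated as non-regional
        regional.add(name)
        target = exclusive if len(hits) == 1 else inclusive
        target.update(hits)
    return regional, inclusive, exclusive


# Invented users, for illustration only.
users = {
    "u1": {"tweet": "Stay safe @example_news_ms", "biography": "USM alum #SMTTT",
           "location": "Hattiesburg, MS"},
    "u2": {"tweet": "hello", "biography": "", "location": "Birmingham"},
    "u3": {"tweet": "hi", "biography": "coffee lover", "location": "Seattle, WA"},
}
regional, inclusive, exclusive = tally(users)
```

Here `u1` is counted as inclusive under every filter that matched it, `u2` is exclusive to the location place-name filter, and `u3` is left out. A case-insensitive whole-word match on short terms such as "AL" can also match unrelated words, which is one reason an independent validation step is useful.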
Regional users were confirmed and validated in a two-tier approach based on the available GPS coordinates and an independent quality assessment. GPS coordinates for users who had activated geolocation were compared against regionally defined users to confirm their presence in Alabama or Mississippi.
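A rough version of the GPS comparison can be sketched as follows. The bounding box is an approximation chosen for illustration and also covers slivers of neighboring states; a faithful check against state borders would need polygon data. The function names and sample coordinates are assumptions, not part of the study.

```python
# Approximate bounding box around Mississippi and Alabama (illustrative assumption;
# true state borders require polygon data).
REGION_BOX = {"lat_min": 30.1, "lat_max": 35.1, "lon_min": -91.7, "lon_max": -84.8}


def in_region(lat, lon, box=REGION_BOX):
    """Return True if the coordinate falls inside the bounding box."""
    return (box["lat_min"] <= lat <= box["lat_max"]
            and box["lon_min"] <= lon <= box["lon_max"])


def confirm_users(gps_tweets, regional_users):
    """Split regional users with GPS tweets into confirmed and unconfirmed sets.

    gps_tweets is an iterable of (username, latitude, longitude) tuples.
    A user is confirmed if at least one of their GPS tweets falls in the region.
    """
    seen, confirmed = set(), set()
    for user, lat, lon in gps_tweets:
        if user not in regional_users:
            continue  # only users already classified as regional are checked
        seen.add(user)
        if in_region(lat, lon):
            confirmed.add(user)
    return confirmed, seen - confirmed


# Invented sample data: Hattiesburg, Seattle, and Birmingham coordinates.
gps_tweets = [
    ("a", 31.33, -89.29),
    ("b", 47.61, -122.33),
    ("c", 33.52, -86.80),
]
confirmed, unconfirmed = confirm_users(gps_tweets, regional_users={"a", "b"})
```

Users left in the unconfirmed set are candidates for manual review, since they may simply have been traveling during the window.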
A research team composed of an epidemiologist and a holder of a master's degree in biomedical science, independent of the coding protocol, evaluated the data to determine whether users classified as regional met the criteria established in Table 2 and whether users classified as non-regional were appropriately excluded. The evaluation found no apparent errors or aberrations in the use of those terms. The regional GPS threshold, regional quality assessment, and non-regional quality assessment were set at a 99% confidence level and a 3.0% margin of error.
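For readers who want to relate a confidence level and margin of error to a sample size, one standard approach is Cochran's formula for a proportion, optionally with a finite population correction. The sketch below is a generic illustration; the text does not state which formula the team used, and the function name, the default proportion of 0.5, and the population of 10,000 in the example are assumptions.

```python
import math
from statistics import NormalDist


def sample_size(confidence, margin, population=None, proportion=0.5):
    """Cochran's sample size for estimating a proportion.

    confidence: two-sided confidence level, e.g. 0.99
    margin: margin of error as a fraction, e.g. 0.03
    population: if given, apply the finite population correction
    proportion: assumed true proportion (0.5 is the most conservative choice)
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = z ** 2 * proportion * (1 - proportion) / margin ** 2
    if population is not None:
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)


# 99% confidence and a 3.0% margin of error, without and with a finite population.
n_unbounded = sample_size(0.99, 0.03)
n_finite = sample_size(0.99, 0.03, population=10_000)
```

The finite population correction always reduces the required sample, and the reduction grows as the sample becomes a larger share of the population.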
The study received an IRB exemption for human subjects research from the William Carey University IRB Committee.
The approximately 5.5 billion tweets posted during the 11-day span were reduced to 1.1 million tweets by the PowerTrack filters.
Data were first evaluated for the 96-hour window around the tornado (Table 3). The window contained 127,954 posted tweets from 26,938 users, of which only 81,441 were regionally defined tweets, posted by 8,423 regional users; 515 users had activated the GPS setting in Twitter (Table 3).
Table 3. Data from a 96-hour window around the tornado impact.

| Category | Total |
|---|---|
| Total tweets | 127,954 |
| Total users | 26,938 |
| Regional tweets | 81,441 |
| Regional users | 8,423 |
| GPS users | 515 |
| Regional users with GPS | 463 |
| Regional tweets with GPS | 2,353 |
| Regional confirmed with GPS | 2,290 |
Twitter users were first evaluated by filtering on three categories: the tweets, biographies, and locations of users. The relationship between the filters and the collected Twitter users showed that the terms were either mutually exclusive (a single filter detected the user) or mutually inclusive (more than one filter detected the same user) (Table 2).
The first filter category, labeled ‘Tweet’, captured users who specifically referenced the Twitter username of a regional news media outlet (1,938 inclusive users and 1,444 exclusive users) (Table 2). The second filter category, labeled ‘Biography’, used the terms Mississippi, Alabama, Hattiesburg, Birmingham (845 inclusive and 186 exclusive users), USM and Ole Miss (417 inclusive users and 126 exclusive users), and SMTTT (327 inclusive users and 74 exclusive users). The third filter category, labeled ‘Location’, used the terms Mississippi, Alabama, Hattiesburg, Birmingham (4,077 inclusive and 1,826 exclusive users), MS, AL (4,175 inclusive and 1,882 exclusive users), all ZIP codes for MS & AL (42 inclusive and 44 exclusive users), USM and Ole Miss (22 inclusive users and 18 exclusive users), and SMTTT (1 inclusive user and 0 exclusive users). Further evaluation also revealed that 1,201 users identified as regional did not enter a biography, and 316 did not enter a location.
The profiles of the 8,423 users encompassed 40 time zones, and 2,268 users did not enter a time zone (Table 4). Central Time was listed on 4,268 user profiles, Mountain Time on 680, Eastern Time on 618, Pacific Time on 200, and the other 36 time zones together on 389. Because the time zone results were so variable, time zones were excluded as usable regional criteria.
| Time Zones | Users |
|---|---|
| Central Time (US & Canada) | 4,268 |
| Null | 2,268 |
| Mountain Time (US & Canada) | 680 |
| Eastern Time (US & Canada) | 618 |
| Pacific Time (US & Canada) | 200 |
| Other | 389 |
The 96-hour (4-day) Twitter analysis yielded the following pre-storm results: total users (3,145), total tweets (27,927), total retweets (6,551), total tweets with a hashtag (5,763), total GPS tweets (758), total Twitter application types (141), total verified users (3), total languages (6), average user account length (785 days), average followers (745), average Klout (33), and average friends (506).
Post-storm results were: total users (7,501), total tweets (53,514), total retweets (20,758), total tweets with a hashtag (13,598), total GPS tweets (1,879), total application types (192), total verified users (3), total languages (8), average user account length (846 days), average followers (779), average Klout (34), and average friends (504) (Table 1).
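As a quick consistency check, the pre-storm and post-storm totals can be added and compared with the combined figures reported elsewhere in the article (81,441 tweets, 10,646 users, 27,309 retweets, and 2,637 tweets with GPS coordinates). The sketch below uses only the numbers reported above; the variable names are arbitrary.

```python
# Pre-storm and post-storm totals as reported in Table 1.
TABLE_1_TOTALS = {
    "users": (3145, 7501),
    "tweets": (27927, 53514),
    "retweets": (6551, 20758),
    "tweets with a hashtag": (5763, 13598),
    "GPS tweets": (758, 1879),
}

# Combined 96-hour figures: pre-storm plus post-storm.
combined = {name: pre + post for name, (pre, post) in TABLE_1_TOTALS.items()}

for name, total in combined.items():
    print(f"{name}: {total:,}")
```

The sums reproduce the combined figures. The combined user figure of 10,646 therefore appears to be the sum of the two period counts, so a user active in both periods is counted twice; the 96-hour window contains 8,423 unique regional users (Table 3).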
"0" signifies the storm impact. "-50" is 50 hours pre-storm. "46" is 46 hours post-storm. This figure displays the usage of tweets by regionally identified Mississippi & Alabama Twitter users.
"0" signifies the storm impact. "-50" is 50 hours pre-storm. "46" is 46 hours post-storm. This figure displays the usage of hashtags by regionally identified Mississippi & Alabama Twitter users.
"0" signifies the storm impact. "-50" is 50 hours pre-storm. "46" is 46 hours post-storm. This figure displays the usage of Retweets by regionally identified Mississippi & Alabama Twitter users.
GPS data were provided by 515 regional users, of whom 463 were found to have tweeted with GPS locations in Alabama or Mississippi during the 11-day span. Of the 52 users without GPS locations in those states, 30 listed Hattiesburg, MS as their location, and 9 listed other locations within Mississippi and Alabama. Counting these users, who may have been traveling but identified themselves with Mississippi or Alabama locations, puts the likely regional accuracy at 502 of 515 (97.4%). Inspection of the GPS locations of the individual tweets showed that 2,290 of 2,353 fell within the state borders. Closer inspection found that 41 of the 63 tweets outside the borders were posted by users who also listed their location as Hattiesburg, MS.
Quality assessment of 900 regional users showed 100% validation, with no aberrant term usage. Assessment of 900 non-regional users found two users who were likely Mississippi or Alabama regional users during the four-day window.
The study team successfully developed and validated a novel methodological approach for extracting regional Twitter data despite the anonymity that popular social media platforms afford their users. This triangulation approach is designed around the Twitter API or “firehose” to provide a real-time or cross-sectional technique for accurately predicting user location.
The percent of users that were captured by a single regional term.
The type of device usage stayed consistent pre-storm and post-storm, with mobile phones remaining above 70%; this points to the possible use of the GPS feature during a disaster and to the mobility of this communication medium in a disaster period. Prime hours of Twitter usage for the local population were evident, with use peaking during daytime hours and diminishing during nighttime hours.
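The hourly activity series behind the usage figures (hour 0 at storm impact, running from 50 hours before to 46 hours after) can be sketched as below. The impact time comes from the text; treating it as Central Standard Time (UTC-6) is an assumption, as are the function names and the invented timestamps.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Impact time from the text: February 10, 2013, 5:00 pm.
# Central Standard Time (UTC-6) is an assumption; the text gives no time zone.
CST = timezone(timedelta(hours=-6))
IMPACT = datetime(2013, 2, 10, 17, 0, tzinfo=CST)


def hour_offset(posted, impact=IMPACT):
    """Whole hours relative to impact: 0 is the first hour after, -1 the hour before."""
    return int((posted - impact).total_seconds() // 3600)


def hourly_counts(timestamps, first=-50, last=46):
    """Count tweets per hour offset, filling hours without tweets with zero."""
    counts = Counter(hour_offset(t) for t in timestamps)
    return {hour: counts.get(hour, 0) for hour in range(first, last + 1)}


# Invented timestamps, for illustration only.
timestamps = [
    IMPACT + timedelta(minutes=5),
    IMPACT - timedelta(minutes=30),
    IMPACT + timedelta(hours=46, minutes=10),
    IMPACT - timedelta(hours=60),  # outside the plotted window, so dropped
]
series = hourly_counts(timestamps)
```

Filling empty hours with zero keeps the series evenly spaced, which makes the day and night pattern visible when the counts are plotted.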
Currently, few studies provide an easily accessible methodology for harnessing the power of social media. By providing a methodology for targeting a local group or a subset of a population, one can use social media locally and globally in many fields, including but not limited to public health, information dissemination strategies, sociology, marketing, branding, political science, health information and behaviors, entrepreneurship, business, psychology, criminology, sex trafficking, drug trafficking, relationship cultivation strategies, linguistic studies, population studies, migration pattern studies, behavior studies, and educational technique, all of which were once inaccessible to the lay user.
Twitter’s features make it a unique social media tool that emergency management and public health officials can use for rapid and accurate two-way communication.
The triangulation (regional) approach can be further adapted to real-time solutions based on the presence of an event within a region of the country. For example, bombing events could be regionally isolated based on certain regional criteria, thereby zooming in on that part of the Twitter data stream to gain real-time analysis. Active and passive surveillance can be enhanced by coupling the approach with an artificial-intelligence-like system that monitors specific hashtags.
Initial extraction must be performed based on criteria that bias the data toward locally specific tweets, because independent access to the full Twitterverse database is prohibited by the terms of service. To limit bias, the team used a comprehensive, broad-based initial extraction. The findings are limited by the scope of time (96 hrs) and the nature of the event that was monitored in the region. Regardless, the pre-tornado activity was typical of most normal days and activities in the region. The team was still able to extract and triangulate a total of 81,441 tweets, 10,646 Twitter users, 27,309 retweets, and 2,637 tweets with GPS coordinates (pre- and post-storm combined).
Search terms were in English only and may have missed minor misspellings of those terms that were not caught in the small user sampling. City names other than Birmingham and Hattiesburg were not used because cities with the same names exist in other states. Social media sites often provide anonymity that many users wish to preserve, making it impossible to define their region. Some users may be on business accounts. Some people view tweets without creating accounts.
User mentions are difficult to ascertain because of name similarity between users. Pseudo-retweets, in which a user sends a tweet designed to look like a retweet, play a small role in the data but present problems during filtration and extraction. Therefore, the measured population that received and responded to a tweet is likely an underestimate of the true population that received the message. Retweets are identical tweets that were forwarded on to other users. While one cannot truly measure whether action was taken on a Twitter communiqué, a retweet is an acknowledgment that a tweet was received and deemed important enough to relay to the user's followers.
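One way to separate native retweets from pseudo-retweets during filtration is to combine a metadata flag with a text pattern, as in the sketch below. The `is_retweet` field is a hypothetical stand-in for whatever metadata marks a native retweet in the extracted data, and the "RT @user" convention is only one of several ways a manually typed retweet can appear, so this is an illustration rather than the study's procedure.

```python
import re

# "RT @user" typed by hand, at the start of the text or after whitespace.
PSEUDO_RT = re.compile(r"(?:^|\s)RT\s*@\w+", re.IGNORECASE)


def classify(tweet):
    """Return 'retweet', 'pseudo-retweet', or 'original'.

    `is_retweet` is a hypothetical boolean field marking a native retweet.
    """
    if tweet.get("is_retweet"):
        return "retweet"
    if PSEUDO_RT.search(tweet.get("text", "")):
        return "pseudo-retweet"
    return "original"


# Invented tweets, for illustration only.
tweets = [
    {"text": "RT @example_news: Tornado warning for Forrest County", "is_retweet": True},
    {"text": "Take cover! RT @example_news: Tornado warning", "is_retweet": False},
    {"text": "Power is out on campus", "is_retweet": False},
    {"text": "art @example_gallery tonight", "is_retweet": False},
]
labels = [classify(tweet) for tweet in tweets]
```

Requiring whitespace or the start of the text before "RT" avoids flagging words that merely end in those letters, as the last sample tweet shows.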
This study describes in technical detail the application of a novel triangulation methodology used to filter the haystack of tweets transmitted during the 2013 Hattiesburg Tornado from the more than 2 billion tweets emitted on the Twitterverse in the 96-hour window around the storm. The data generated by the approach provide a descriptive analysis of regional Twitter activity in the 48 hours before and after the Hattiesburg Tornado. Being able to target a subset of a population makes rapid information dissemination possible, leading to a potential improvement in morbidity and mortality outcomes in local disasters.
The authors have declared that no competing interests exist.
We would like to thank the following for their invaluable support of this project: James Turner, DO, Dean of William Carey University College of Osteopathic Medicine, Wesley Medical Center, Forrest General Hospital, Sherry Turner, DO, Elizabeth Smith-Trigg, Sarah Middleton, Kyle Hopkins and Forrest County Emergency Operations Center.