Characterizing Misinformed Online Health Communities

Memon, Shahan Ali

doi:10.1184/R1/12988244.v1

thesis

posted on 2020-09-22, 14:06 authored by Shahan Ali Memon

Do vaccines cause autism? Does drinking bleach cure coronavirus? Is polio vaccination a ploy to sterilize and reduce the population? From a medical perspective, the definitive answer to all these questions is “no”. And yet the web is populated by a cacophony of mixed opinions about these issues, triggered by the proliferation of public health misinformation. Health-related misinformation has detrimental effects on public health, and debunking it is a challenging task. Because these misinformed sub-communities discourage differing beliefs, public health practitioners and policy makers must grapple with the challenge of penetrating these communities to disseminate facts or conduct any message-based intervention. Combating the spread of false information by differential promotion or censorship of content, or by broadcasting facts, does not work. Instead, there is a need to communicate strategically with the misinformed communities. This requires a thorough understanding of what constitutes an effective communication paradigm for debunking such myths.

For an effective message-based intervention, it is imperative to focus on preference-based framing, in which the preferences of the target sub-community are taken into consideration. These preferences can be defined over two main aspects: (i) who should deliver the message; and (ii) what the message should be. Choosing the right messenger(s) requires understanding how these online communities interact, by tapping into their network structures. Choosing the content of the message, on the other hand, requires a thorough understanding of the language choices they make and how those choices reflect their non-negotiable social identities.

In this work, we identify two different online health communities: (i) vaccination sub-communities; and (ii) COVID-19 misinformation sub-communities. In the first part of this thesis, we characterize the network and sociolinguistic variation in the competing online vaccination sub-communities to understand their linguistic choices and motivations. With the emergence of the COVID-19 pandemic, political and medical misinformation has escalated to create what is commonly referred to as the global infodemic. Thus, in the second part, we first introduce a novel Twitter dataset, CMU-MisCov19, annotated for different COVID-19 themes. We then use this dataset to characterize the competing COVID-19 misinformation sub-communities. Our analyses show that the competing sub-communities within each part tend to differ significantly in their communication patterns, and that these differences can be leveraged to form better message interventions. We also make our annotated dataset available for the community to use for further analysis.

Funding

CMLH Center for Machine learning and Health

History

Date

2020-08-13

Degree Type

Master's Thesis

Department

Language Technologies Institute

Degree Name

Master of Science (MS)

Advisor(s)

Kathleen M. Carley

Keywords

human information behavior, natural language processing, social and community informatics, linguistics

Licence

In Copyright
