Social Media Analytics for Stance Mining: A Multi-Modal Approach with Weak Supervision

Kumar, Sumeet

doi:10.1184/R1/12352079.v1

sumeetku_phd_isr_2020.pdf (31.32 MB)

thesis

posted on 2020-05-21, 22:32 authored by Sumeet Kumar

People express their opinions on blogs and other social media platforms. According to a recent estimate, interactions on Twitter alone produce over 500 million tweets per day. The magnitude of this data enables new applications of opinion mining that have previously remained challenging, e.g., finding users’ stance (as in pro or con) on topics of interest. However, one of the major barriers to utilizing this amount of data is the cost of hand-labeling examples for machine learning. This barrier is even more apparent in stance mining, as opinions can change over time and can be about any issue. To reduce the need for hand-labeled data by taking the complex interactions of social media users and their social influence into account, this dissertation develops semi-supervised methods for stance mining.

Most existing studies on stance mining take a simplistic view that assumes a sentence (like a Tweet) holds a perspective that is independent of the context and the author’s network position. This approach to stance learning leaves three crucial challenges unresolved. First, how do we train stance-learning models on new topics with minimal labeling effort? Discussion topics change fast and new issues emerge, making it difficult to reuse prior labeled data. However, artifacts of social networks like hashtags can give a noisy signal about the stance of users. To extract the signal from the noise, I develop methods to find useful hashtags by exploiting how users in the pro-group and the anti-group use popular hashtags. Second, how can we use multiple interaction modalities for stance mining? Users’ opinions are evident in different types of interactions, e.g., tweeting, retweeting, or liking. I develop a semi-supervised method based on co-training that jointly trains multiple stance classifiers using different interaction modalities, resulting in a better stance prediction model. Third, how can we leverage users’ networks for stance prediction? The current approaches to stance learning ignore important network factors such as the interactions of social media users (e.g., a person’s preference can also be inferred from their friends’ preferences). I use network alignment as one of the training signals to train the stance classifiers.

My thesis brings a new direction to the stance learning problem that is grounded in social theory, is more amenable to analyzing activities on social media, and allows effective learning from multiple types of interactions without requiring large amounts of labeled data. By labeling only a few hashtags used in Twitter conversations on a few controversial topics, my approach allows for predicting both the stance of users (as in whether they are pro or con on a topic) with over 80% accuracy and the stance in conversations (as in whether they favor or deny others’ posts) with over 70% accuracy.

History

Date

2020-05-12

Degree Type

Dissertation

Department

Institute for Software Research

Degree Name

Doctor of Philosophy (PhD)

Advisor(s)

Kathleen M. Carley

Keywords

social media, social networks, opinion-mining, semi-supervised, stance

Licence

In Copyright
