Graduate Thesis Or Dissertation

Learning Topical Social Media Sensors for Twitter

https://ir.library.oregonstate.edu/concern/graduate_thesis_or_dissertations/70795c45p

Descriptions

Abstract
  • Social media sources such as Twitter represent a massively distributed social sensor over diverse topics ranging from social and political events to entertainment and sports news. However, because of the overwhelming volume of content, it can be difficult to identify novel and significant content within a broad topic in a timely fashion. To this end, this thesis proposes a scalable and practical method to automatically construct social sensors for generic topics. The concept of using social media as a sensor for detecting events and news has been proposed in the literature. However, we argue that most of these works either do not focus on targeted content detection or use very basic methods to collect the topical data for further analysis. This leaves a gap in the use of social media as a sensor for high-quality topical content detection, which we aim to address via machine learning. In this thesis, given minimal supervised training content from a user, we learn to identify topical tweets from millions of features capturing content, user, and social interactions on Twitter. On a corpus of over 800 million English tweets collected from the Twitter streaming API during 2013 and 2014, and learning for 10 diverse topics, we empirically show that our learned social sensor automatically generalizes to unseen future content with high ranking and precision scores. Furthermore, we provide an extensive analysis of features and feature types across different topics that reveals, for example, that (1) largely independent of topic, simple terms are the most informative features, followed by location features, and that (2) the number of unique hashtags and tweets by a user correlates more with their informativeness than their follower or friend count. In summary, this work provides a novel, effective, and efficient way to learn topical social sensors that requires minimal user curation effort and offers strong generalization performance for identifying future topical content.
Additional Information
  • description.provenance : Approved for entry into archive by Julie Kurtz (julie.kurtz@oregonstate.edu) on 2016-06-14T22:30:41Z (GMT) No. of bitstreams: 2 license_rdf: 1232 bytes, checksum: bb87e2fb4674c76d0d2e9ed07fbb9c86 (MD5) ImanZahra2016.pdf: 3143360 bytes, checksum: c42b7f7b8992da2e4094f7e4b04c7094 (MD5)
  • description.provenance : Made available in DSpace on 2016-06-15T22:46:19Z (GMT). No. of bitstreams: 2 license_rdf: 1232 bytes, checksum: bb87e2fb4674c76d0d2e9ed07fbb9c86 (MD5) ImanZahra2016.pdf: 3143360 bytes, checksum: c42b7f7b8992da2e4094f7e4b04c7094 (MD5) Previous issue date: 2016-05-25
  • description.provenance : Approved for entry into archive by Laura Wilson (laura.wilson@oregonstate.edu) on 2016-06-15T22:46:19Z (GMT) No. of bitstreams: 2 license_rdf: 1232 bytes, checksum: bb87e2fb4674c76d0d2e9ed07fbb9c86 (MD5) ImanZahra2016.pdf: 3143360 bytes, checksum: c42b7f7b8992da2e4094f7e4b04c7094 (MD5)
  • description.provenance : Submitted by Zahra Iman (imanz@oregonstate.edu) on 2016-05-27T23:04:43Z No. of bitstreams: 2 license_rdf: 1232 bytes, checksum: bb87e2fb4674c76d0d2e9ed07fbb9c86 (MD5) ImanZahra2016.pdf: 3143360 bytes, checksum: c42b7f7b8992da2e4094f7e4b04c7094 (MD5)

Relationships

Parents:

This work has no parents.
