Dissertation
Modeling emerging news stories across digital publications and social media
Doctor of Philosophy (Ph.D.), Drexel University
Mar 2022
DOI: https://doi.org/10.17918/00001382
Abstract
The complex process of news production, through intense competition for audience attention among outlets, generates content that presents significant interaction challenges to the end user. In addition to the information overload introduced by the transposition of the 24-hour news cycle onto the internet, the interaction and intersection of digital news production networks and social network platforms have introduced further complex phenomena in the form of novel digital artifacts. One such phenomenon is the interactivity-enabled inclusion, or "embedding," of user-generated content from social media within digital news articles. The preponderance of such complexity within news stories places great cognitive burdens on users, i.e., news consumers, attempting to process this multifaceted information stream. News aggregators offer an avenue for users to organize and direct their consumption; however, major news aggregators in use today remain rudimentary, simply replicating hyperlinked representations of conventional newspaper-style story organization. This landscape of news retrieval and consumption technologies presents the opportunity to conceptualize and design more intelligent systems and tools that can help reduce the challenging complexity of news consumption through distributed cognition. In this dissertation, I present a multi-year ongoing data collection effort, NewsTweet, that acquires digital news articles, embedded social media content from Twitter, and further potentially newsworthy tweets. I analyze the dataset acquired by NewsTweet to investigate the mechanisms and associated journalistic practices of tweet embedding through the perspective of temporality as well as outlet and embedded user dynamics.
Next, I describe a network construction approach from embedded tweets and connected articles, investigate network structures, identify the topical significance of connected subgraphs, and evaluate methods for segmentation of larger connected subgraphs into further topically distinct groups of articles. I introduce, implement, and evaluate state-of-the-art semantic featurization and unsupervised clustering methods for generalized news aggregation. Further, I describe and implement an algorithmic approach for combining embedded subgraph segmentation with semantic clustering so that network structure from granular subgraphs may be preserved within a larger aggregation pipeline. Through a series of experiments, I demonstrate the potential for modular and scalable design of a novel model of news aggregation that constructs hierarchies of stories in the news, which may be extended and deployed within user-facing applications that can help news consumers navigate contexts and hierarchies associated with news stories interactively, engendering a novel class of news aggregator design. The aggregation model presented in this work incorporates state-of-the-art semantic technologies to implement a topological organization of related news stories, allowing for guided exploration of interconnected events and associated temporal dynamics. A novel class of news aggregator design enabled by the technology developed in this work can help improve user experience of news consumption and create a positive impact in terms of digital social health by rethinking news discovery mechanics and optimizing for context navigation and exploration rather than engagement.
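The network construction step described in the abstract can be illustrated with a minimal sketch. It assumes that each record is an (article, embedded tweet) pair and that two articles are connected whenever they embed the same tweet; the function name, identifiers, and sample data are hypothetical and are not taken from the dissertation or from the NewsTweet dataset. The sketch uses a simple union-find structure to recover connected groups of articles, which is one common way to find connected subgraphs and not necessarily the method used in the work itself.

```python
from collections import defaultdict


def connected_article_groups(embeds):
    """Group articles that are linked through shared embedded tweets.

    embeds: iterable of (article_id, tweet_id) pairs (hypothetical format).
    Returns a sorted list of sorted article-id lists, one per connected group.
    """
    parent = {}

    def find(node):
        # Locate the representative of the node's set, compressing the path.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Articles and tweets are tagged so that their ids cannot collide.
    for article_id, tweet_id in embeds:
        union(("article", article_id), ("tweet", tweet_id))

    groups = defaultdict(set)
    for node in list(parent):
        kind, ident = node
        if kind == "article":
            groups[find(node)].add(ident)
    return sorted(sorted(group) for group in groups.values())


# Made-up example: a1, a2, a3 are chained through tweets t1 and t2; a4 stands alone.
example_embeds = [
    ("a1", "t1"),
    ("a2", "t1"),
    ("a2", "t2"),
    ("a3", "t2"),
    ("a4", "t9"),
]
print(connected_article_groups(example_embeds))
# prints [['a1', 'a2', 'a3'], ['a4']]
```

Each resulting group corresponds to one connected subgraph of articles; segmenting larger groups into topically distinct subsets, as the abstract describes, would be a separate step that this sketch does not attempt.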
Details
- Title: Modeling emerging news stories across digital publications and social media
- Creators: Munif Ishad Mujib
- Contributors: Jake Williams (Advisor)
- Awarding Institution: Drexel University
- Degree Awarded: Doctor of Philosophy (Ph.D.)
- Publisher: Drexel University; Philadelphia, Pennsylvania
- Number of pages: xii, 128 pages
- Resource Type: Dissertation
- Language: English
- Academic Unit: Information Science (Informatics) (2013-2026); College of Computing and Informatics (2013-2026); Drexel University
- Other Identifier: 991017491293204721