CCE Theses and Dissertations

Date of Award


Document Type


Degree Name

Doctor of Philosophy (PhD)


College of Computing and Engineering


Wei Li

Committee Member

Ling Wang

Committee Member

Ajoy Kumar


event detection, machine learning, sentiment analysis, social network analysis, text mining


There is currently no clear-cut, commonly understood definition of what an event is in the context of Social Network Analysis (SNA). Events are commonly identified and measured with regard to repeated occurrences of related terms associated with a topic that gradually increase in frequency and then eventually decline. This ebb and flow of keyword frequencies occurs within a continuous stream of user messages on a social media platform such as Twitter. One disadvantage of this approach is that it tends to marginalize the human perspective of communication and event detection in favor of lexical trends. The goal of this study was to develop an alternate event detection technique and apply it to social media discussion venues such as Twitter. What was novel about our approach was that it integrated two SNA metrics into a single metric called Newsworthiness. To test our method, we collected two 14-day datasets based on two different trending topics from current events. The first dataset was based on the keyword search “Tulsa+Rally.” The second dataset was based on the keywords “Atlanta+Protests.” Both datasets were graphed for their corresponding Newsworthiness and keyword frequency trajectories. The results of the two “Tulsa+Rally” graphs demonstrated that the Newsworthiness approach identified events that were undetectable by the keyword frequency approach. Results for the two “Atlanta+Protests” graphs were congruent in that they each identified the same three events. Our contribution to the body of research was threefold. First, we created a single metric called Newsworthiness by integrating Shannon Entropy and Diffusion Centrality. Second, we demonstrated the evaluative benefits of using quartiles to analyze Newsworthiness distributions for outliers and event peaks.
Lastly, we demonstrated how to evaluate user activity by analyzing the Shannon Entropy and Diffusion Centrality of a discussion stream over the most efficient time period (p). Empirical results showed that the proposed metric, along with quartile-based analysis, provides a way to quantitatively identify events on social, political, and cybersecurity Twitter topics, and that its performance is superior to that of keyword search. The results also indicated that the proposed metric has the potential to be applied to other topics and social platforms for event detection.
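
As an illustration of two building blocks named in the abstract, the following minimal Python sketch computes the Shannon entropy of the items observed in one time period and flags unusually high scores using quartiles. It is a sketch under stated assumptions, not the dissertation's implementation: the abstract does not give the Newsworthiness formula, so the function names, the choice of items (for example, the authors of the messages in a period), and the use of Tukey's upper fence (Q3 + 1.5 × IQR) as the outlier rule are all assumptions made here for illustration.

```python
import math
import statistics
from collections import Counter


def shannon_entropy(items):
    """Shannon entropy, in bits, of the empirical distribution of `items`.

    `items` could be, for example, the authors of the messages posted
    in one time period (an assumption; the abstract does not specify).
    """
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def upper_outliers(scores, k=1.5):
    """Indices of scores above Q3 + k * IQR (Tukey's upper fence).

    The fence is an assumed rule for "quartile-based" peak detection;
    the source only says quartiles were used to find outliers and peaks.
    """
    q1, _, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    fence = q3 + k * (q3 - q1)
    return [i for i, s in enumerate(scores) if s > fence]
```

Applied to a per-period series of scores (one value per period p), `upper_outliers` returns the positions of candidate event peaks; lowering `k` makes the rule more sensitive, and any real threshold would need to be tuned against the data.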