Date of Award
Doctor of Philosophy in Computer Science (CISD)
College of Engineering and Computing
Michael J. Lazlo
Amon B. Seagull
Measures of classical rhetorical structure in text can improve accuracy in certain types of stylistic classification tasks such as authorship attribution. This research augments the relatively scarce work in the automated identification of rhetorical figures and uses the resulting statistics to characterize an author's rhetorical style. These characterizations of style can then become part of the feature set of various classification models.
Our Rhetorica software identifies 14 classical rhetorical figures in free English text, with generally good precision and recall, and provides summary measures to use in descriptive or classification tasks. Classification models trained on Rhetorica's rhetorical measures paired with lexical features typically performed better at authorship attribution than either set of features used individually. The rhetorical measures also provide new stylistic quantities for describing texts, authors, genres, etc.
James Java. 2015. Characterization of Prose by Rhetorical Structure for Machine Learning Classification. Doctoral dissertation. Nova Southeastern University. Retrieved from NSUWorks, College of Engineering and Computing. (347)