CCE Theses and Dissertations
Campus Access Only
All rights reserved. This publication is intended for use solely by faculty, students, and staff of Nova Southeastern University. No part of this publication may be reproduced, distributed, or transmitted in any form or by any means, now known or later developed, including but not limited to photocopying, recording, or other electronic or mechanical methods, without the prior written permission of the author or the publisher.
Date of Award
2009
Document Type
Dissertation - NSU Access Only
Degree Name
Doctor of Philosophy in Computer Information Systems (DCIS)
Department
Graduate School of Computer and Information Sciences
Advisor
Sumitra Mukherjee
Committee Member
William L. Hafner
Committee Member
Michael J. Laszlo
Keywords
Concept vector, Hierarchy, Ontology, Reuters-21578, Support Vector Machine, Text classification
Abstract
As the quantity of text documents created on the web has grown, the ability of experts to classify them manually has decreased. Because people need to find and organize this information, interest has grown in developing automatic means of categorizing these documents. As part of this effort, ontologies have been developed that capture domain-specific knowledge in the form of a hierarchy of concepts.
Support Vector Machines are machine learning methods that are widely used for automated document categorization. Recent studies suggest that the classification accuracy of a Support Vector Machine may be improved by using concepts defined by a domain ontology instead of using the words that appear in the document. However, such studies have not taken into account the hierarchy inherent in the relationship between concepts. The goal of this dissertation was to investigate whether the hierarchical relationships among concepts in ontologies can be exploited to improve the classification accuracy of web documents by a Support Vector Machine.
Concept vectors that capture the hierarchy of domain ontologies were created and used to train a Support Vector Machine. Tests conducted using the benchmark Reuters-21578 data set indicate that Support Vector Machines achieve higher classification accuracy when they make use of the hierarchical relationships among concepts in ontologies.
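The abstract does not say how the concept vectors were built, so the following is only a minimal sketch of one way a vector could reflect an ontology's hierarchy. The toy ontology (`PARENT`), the word-to-concept mapping (`WORD_TO_CONCEPT`), and the `decay` weighting are all hypothetical and are not taken from the dissertation.

```python
from collections import Counter

# Hypothetical toy ontology: child concept -> parent concept (None marks the root).
PARENT = {
    "commodity": None,
    "grain": "commodity",
    "wheat": "grain",
    "corn": "grain",
    "metal": "commodity",
    "gold": "metal",
}

# Hypothetical mapping from surface words to ontology concepts.
WORD_TO_CONCEPT = {
    "wheat": "wheat",
    "maize": "corn",
    "corn": "corn",
    "gold": "gold",
    "bullion": "gold",
}

# Fixed feature order so every document yields a vector of the same shape.
CONCEPTS = sorted(PARENT)


def ancestors(concept):
    """Yield the ancestors of a concept, nearest first."""
    node = PARENT[concept]
    while node is not None:
        yield node
        node = PARENT[node]


def concept_vector(text, decay=0.5):
    """Map a document to concept weights, propagating counts up the hierarchy.

    Each matched concept adds 1.0 to itself and decay**k to its k-th
    ancestor. This weighting is an assumption made for illustration;
    decay=0 gives a flat vector that ignores the hierarchy.
    """
    weights = Counter()
    for word in text.lower().split():
        concept = WORD_TO_CONCEPT.get(word)
        if concept is None:
            continue  # word has no concept in the toy ontology
        weights[concept] += 1.0
        for depth, ancestor in enumerate(ancestors(concept), start=1):
            weights[ancestor] += decay ** depth
    return [weights[c] for c in CONCEPTS]


def dot(u, v):
    """Plain dot product, used here as a simple similarity measure."""
    return sum(a * b for a, b in zip(u, v))


wheat_doc = "wheat exports rose"
maize_doc = "maize harvest fell"

# Flat concept vectors share no feature; hierarchical ones overlap on
# the shared ancestors "grain" and "commodity".
flat_similarity = dot(concept_vector(wheat_doc, decay=0),
                      concept_vector(maize_doc, decay=0))
hier_similarity = dot(concept_vector(wheat_doc),
                      concept_vector(maize_doc))
```

In this sketch, a document about wheat and one about maize have zero similarity under the flat representation but a positive similarity once ancestor concepts receive weight. Vectors like these could then be passed as feature rows to any Support Vector Machine implementation; that training step is left out here.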
NSUWorks Citation
Jeffrey A. Graham. 2009. Effect of ontology hierarchy on a concept vector machine's ability to classify web documents. Doctoral dissertation. Nova Southeastern University. Retrieved from NSUWorks, Graduate School of Computer and Information Sciences. (165)
https://nsuworks.nova.edu/gscis_etd/165.