CCE Theses and Dissertations
Campus Access Only
All rights reserved. This publication is intended for use solely by faculty, students, and staff of Nova Southeastern University. No part of this publication may be reproduced, distributed, or transmitted in any form or by any means, now known or later developed, including but not limited to photocopying, recording, or other electronic or mechanical methods, without the prior written permission of the author or the publisher.
Date of Award
2014
Document Type
Dissertation - NSU Access Only
Degree Name
Doctor of Philosophy in Computer Science (CISD)
Department
Graduate School of Computer and Information Sciences
Advisor
Wei Li
Committee Member
James D. Cannady
Committee Member
Sumitra Mukherjee
Keywords
Bayesian Belief Networks, Interestingness based Bayesian Outlier eXplainer, Interestingness Measures, Machine Learning, Outlier Analysis, Probabilistic Graphical Models
Abstract
This research explores the potential of improving the explainability of outliers by using Bayesian belief networks as background knowledge. Outliers are deviations from the usual trends of data. Mining outliers may help discover potential anomalies and fraudulent activities. Meaningful outliers can be retrieved and analyzed using domain knowledge. Domain knowledge (or background knowledge) is represented using probabilistic graphical models such as Bayesian belief networks. Bayesian belief networks are graph-based representations used to model and encode mutual relationships between entities. Owing to their probabilistic graphical nature, belief networks are an ideal way to capture sensitivity, causal inference, uncertainty, and background knowledge in real-world data sets. Bayesian networks effectively represent the causal relationships between different entities (nodes) using conditional probabilities, and these probabilistic relationships express the degree of belief between entities. A quantitative measure that computes changes in this degree of belief acts as a sensitivity measure.
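As a rough illustration of this idea (and not code from the dissertation), the following Python sketch computes the change in degree of belief in a toy two-node network A -> B; the network structure and all probabilities are assumptions made for illustration.

# A minimal sketch of belief change as a sensitivity measure in a
# two-node Bayesian belief network A -> B. All values are illustrative.

# Prior belief over the parent node A.
p_a = {True: 0.1, False: 0.9}

# Conditional probability table P(B = True | A).
p_b_given_a = {True: 0.8, False: 0.05}

# Marginal (prior) degree of belief in B = True, before any evidence.
prior_b = sum(p_a[a] * p_b_given_a[a] for a in p_a)

# Posterior degree of belief in B = True once we observe A = True.
posterior_b = p_b_given_a[True]

# The change in degree of belief acts as a simple sensitivity measure.
sensitivity = abs(posterior_b - prior_b)

print(f"prior P(B=T)         = {prior_b:.3f}")
print(f"posterior P(B=T|A=T) = {posterior_b:.3f}")
print(f"belief change        = {sensitivity:.3f}")

A large shift in belief marks the observed evidence as influential, which is the intuition behind treating belief change as a sensitivity measure.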
The first contribution of this research is enhancing the performance of sensitivity measurement, building on earlier research work, the Interestingness Filtering Engine Miner algorithm. The algorithm developed here (IBOX, the Interestingness based Bayesian Outlier eXplainer) provides a progressive improvement in the performance and sensitivity scoring of earlier works. Earlier approaches compute sensitivity by measuring the divergence between the conditional probabilities of training and test data, while using only a couple of probabilistic interestingness measures, such as mutual information and support, to calculate belief sensitivity. With support from the literature as well as quantitative evidence, IBOX provides a framework for using multiple interestingness measures, resulting in better performance and improved sensitivity analysis. The results show improved performance and, therefore, improved explainability of rare-class entities. This research quantitatively validated probabilistic interestingness measures as an effective sensitivity analysis technique in rare-class mining, a novel, original, and progressive research contribution to the areas of probabilistic graphical models and outlier analysis.
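For intuition only, here is a minimal Python sketch of the general approach described above: scoring sensitivity as the shift in probabilistic interestingness measures (support and mutual information) between training and test distributions over a parent-child pair. The joint distributions, function names, and scoring scheme are illustrative assumptions, not the IBOX implementation.

import math

def support(joint):
    """Support: joint probability of the pattern A=True, B=True."""
    return joint[(True, True)]

def mutual_information(joint):
    """Mutual information I(A;B) of a binary joint distribution."""
    p_a = {a: sum(p for (x, _), p in joint.items() if x == a) for a in (True, False)}
    p_b = {b: sum(p for (_, y), p in joint.items() if y == b) for b in (True, False)}
    mi = 0.0
    for (a, b), p in joint.items():
        if p > 0:
            mi += p * math.log(p / (p_a[a] * p_b[b]))
    return mi

# Joint distributions P(A, B) estimated from training and test data
# (made-up numbers for illustration).
train = {(True, True): 0.08, (True, False): 0.02,
         (False, True): 0.05, (False, False): 0.85}
test = {(True, True): 0.02, (True, False): 0.08,
        (False, True): 0.10, (False, False): 0.80}

# Sensitivity score: how much each interestingness measure shifts
# between training and test data; larger shifts flag candidate outliers.
for name, measure in [("support", support), ("mutual information", mutual_information)]:
    shift = abs(measure(train) - measure(test))
    print(f"{name:18s} shift = {shift:.4f}")

Combining the shifts from several such measures, rather than relying on one or two, is the framework-level idea the abstract attributes to IBOX.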
NSUWorks Citation
Adnan Masood. 2014. Measuring Interestingness in Outliers with Explanation Facility using Belief Networks. Doctoral dissertation. Nova Southeastern University. Retrieved from NSUWorks, Graduate School of Computer and Information Sciences. (232)
https://nsuworks.nova.edu/gscis_etd/232