CCE Theses and Dissertations
Campus Access Only
All rights reserved. This publication is intended for use solely by faculty, students, and staff of Nova Southeastern University. No part of this publication may be reproduced, distributed, or transmitted in any form or by any means, now known or later developed, including but not limited to photocopying, recording, or other electronic or mechanical methods, without the prior written permission of the author or the publisher.
Date of Award
2011
Document Type
Dissertation - NSU Access Only
Degree Name
Doctor of Philosophy in Computer Information Systems (DCIS)
Department
Graduate School of Computer and Information Sciences
Advisor
Sumitra Mukherjee
Committee Member
Michael J. Laszlo
Committee Member
Junping Sun
Abstract
Hard disk drives are used in everyday life to store critical data. Although they are reliable, failure of a hard disk drive can be catastrophic, especially in applications such as medicine, banking, air traffic control systems, missile guidance systems, and computer numerical control (CNC) machines. Self-Monitoring, Analysis and Reporting Technology (SMART) can aid in failure prediction by monitoring specific drive attributes and warning the user of an impending failure so that the user can back up data while there is still time. As a consequence, hard drive failure prediction has become an important problem and the subject of active research.
The best available approaches for hard drive failure prediction achieve acceptably low false alarm rates by first selecting a subset of features using non-parametric statistical methods such as reverse arrangements and then using the multiple-instance naïve Bayes classifier for the prediction task. However, the prediction accuracy of this approach is not sufficiently high.
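As an illustration of the non-parametric step mentioned above, the following is a minimal sketch of the textbook reverse arrangements trend test applied to one attribute's time series. It is not taken from the dissertation: the function names and the sample readings are hypothetical, and the normal approximation used for the z-score is the standard one for this test. A large |z| suggests the attribute shows a trend and may be worth keeping as a feature.

```python
import math


def reverse_arrangements(series):
    """Count pairs (i, j) with i < j and series[i] > series[j]."""
    n = len(series)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if series[i] > series[j]
    )


def reverse_arrangements_z(series):
    """z-score of the reverse arrangements count under the no-trend hypothesis.

    Uses the usual normal approximation:
    mean = N(N-1)/4, variance = N(2N+5)(N-1)/72.
    """
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")
    count = reverse_arrangements(series)
    mean = n * (n - 1) / 4
    variance = n * (2 * n + 5) * (n - 1) / 72
    return (count - mean) / math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical readings of a single SMART attribute over time.
    readings = [100, 100, 99, 98, 98, 95, 93, 90, 84, 70]
    print(reverse_arrangements_z(readings))
```

A positive z indicates a downward trend (many earlier values exceed later ones), a negative z an upward trend.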
The focus of this dissertation was to improve the drive failure prediction accuracy while maintaining a low false alarm rate by using a genetic algorithm for feature set reduction in conjunction with the multiple-instance naïve Bayes classifier for the prediction task. This research achieved a failure detection rate of 81% with a 0% false alarm rate on 12 attributes selected by the genetic algorithm. As a secondary contribution, this dissertation investigated the tradeoff between feature subset reduction and prediction accuracy in the hard drive failure prediction problem. This research found that detection accuracy decreased significantly once the number of features fell below 10.
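The general idea of genetic-algorithm feature subset selection can be sketched as follows. This is a generic illustration, not the dissertation's implementation: the operators (tournament selection, one-point crossover, bit-flip mutation, elitism), all parameter values, and the names `select_features_ga` and `toy_fitness` are assumptions. In a real setting the fitness function would score a feature mask by training and evaluating a classifier on the selected attributes; here a made-up stand-in is used so the sketch runs on its own.

```python
import random


def select_features_ga(n_features, fitness, pop_size=30, generations=40,
                       crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Search for a boolean feature mask that maximizes fitness(mask)."""
    rng = random.Random(seed)
    population = [
        [rng.random() < 0.5 for _ in range(n_features)]
        for _ in range(pop_size)
    ]
    best_mask, best_score = None, float("-inf")

    def tournament(scored):
        # Pick two distinct individuals at random and keep the fitter one.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] >= b[0] else b[1]

    for gen in range(generations + 1):
        scored = [(fitness(mask), mask) for mask in population]
        top_score, top_mask = max(scored, key=lambda pair: pair[0])
        if top_score > best_score:
            best_score, best_mask = top_score, list(top_mask)
        if gen == generations:
            break

        next_population = [list(best_mask)]  # elitism: carry over the best mask
        while len(next_population) < pop_size:
            parent1, parent2 = tournament(scored), tournament(scored)
            if n_features > 1 and rng.random() < crossover_rate:
                cut = rng.randrange(1, n_features)  # one-point crossover
                child = parent1[:cut] + parent2[cut:]
            else:
                child = list(parent1)
            # Bit-flip mutation: toggle each feature with small probability.
            child = [
                (not bit) if rng.random() < mutation_rate else bit
                for bit in child
            ]
            next_population.append(child)
        population = next_population

    return best_mask, best_score


# Hypothetical stand-in for classifier-based fitness: pretend attributes
# 0, 3, and 7 are informative, and penalize larger subsets slightly.
INFORMATIVE = {0, 3, 7}


def toy_fitness(mask):
    return sum(1 for i in INFORMATIVE if mask[i]) - 0.1 * sum(mask)


if __name__ == "__main__":
    mask, score = select_features_ga(12, toy_fitness, seed=1)
    print([i for i, keep in enumerate(mask) if keep], score)
```

The size penalty in `toy_fitness` mirrors, in a simplified way, the tradeoff between shrinking the feature subset and preserving prediction accuracy.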
NSUWorks Citation
Harpreet Bhasin. 2011. An Evolutionary Algorithm for Feature Subset Selection in Hard Disk Drive Failure Prediction. Doctoral dissertation. Nova Southeastern University. Retrieved from NSUWorks, Graduate School of Computer and Information Sciences. (91)
https://nsuworks.nova.edu/gscis_etd/91