CCE Theses and Dissertations

Date of Award

2020

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

College of Computing and Engineering

Advisor

James D. Cannady

Committee Member

Wei Li

Committee Member

Peixiang Liu

Abstract

Real-world data streams often contain concept drift and noise. Additionally, it is often the case that due to their very nature, these real-world data streams also include temporal dependencies between data. Classifying data streams with one or more of these characteristics is exceptionally challenging. Classification of data within data streams is currently the primary focus of research efforts in many fields (i.e., intrusion detection, data mining, machine learning). Hierarchical Temporal Memory (HTM) is a type of sequence memory that exhibits some of the predictive and anomaly detection properties of the neocortex. HTM algorithms conduct training through exposure to a stream of sensory data and are thus suited for continuous online learning. This research developed an HTM sequence classifier aimed at classifying streaming data, which contained concept drift, noise, and temporal dependencies. The HTM sequence classifier was fed both artificial and real-world data streams and evaluated using the prequential evaluation method. Cost measures for accuracy, CPU-time, and RAM usage were calculated for each data stream and compared against a variety of modern classifiers (e.g., Accuracy Weighted Ensemble, Adaptive Random Forest, Dynamic Weighted Majority, Leverage Bagging, Online Boosting ensemble, and Very Fast Decision Tree). The HTM sequence classifier performed well when the data streams contained concept drift, noise, and temporal dependencies, but was not the most suitable classifier of those compared against when provided data streams did not include temporal dependencies. Finally, this research explored the suitability of the HTM sequence classifier for detecting stalling code within evasive malware. The results were promising as they showed the HTM sequence classifier capable of predicting coding sequences of an executable file by learning the sequence patterns of the x86 EFLAGs register. The HTM classifier plotted these predictions in a cardiogram-like graph for quick analysis by reverse engineers of malware. This research highlights the potential of HTM technology for application in online classification problems and the detection of evasive malware.

Share

COinS