CEC Theses and Dissertations

Date of Award

2016

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Computer Information Systems (DCIS)

Department

College of Engineering and Computing

Advisor

Sumitra Mukherjee

Committee Member

Michael Laszlo

Committee Member

Francisco J. Mitropoulos

Abstract

While traditional supervised learning focuses on static datasets, an increasing amount of data arrives as streams, where data is continuous and typically processed only once. A common problem with data streams is that the underlying concept being learned may evolve over time. This concept drift has attracted research interest in recent years, and there is a need for improved machine learning algorithms capable of handling it. A promising approach uses an ensemble of diverse classifiers whose constituent classifiers are re-trained when a concept drift is detected. Decisions regarding the number of classifiers to maintain and the frequency of re-training are critical factors that determine classification accuracy in the presence of concept drift. This dissertation systematically investigated these issues in order to develop an improved classifier for online ensemble learning. The impact of reducing the length of time that additional ensembles are kept in memory was studied using artificial and real-world datasets. Findings from these studies revealed that in many cases the number of time steps for which additional ensembles are held in memory can be reduced without sacrificing prequential accuracy. It was also found that this new ensemble approach performed well in the presence of false concept drift.
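For readers unfamiliar with the metric, prequential accuracy is commonly computed with a test-then-train loop: each arriving example is first used to test the model and only afterwards to update it. The following is a minimal sketch of that loop, not the dissertation's implementation. The names `MajorityClassLearner` and `prequential_accuracy` are hypothetical, and the toy learner merely stands in for the ensemble described above.

```python
from collections import Counter


class MajorityClassLearner:
    """Hypothetical toy incremental learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # Before any training, there is no label to predict.
        if not self.counts:
            return None
        return self.counts.most_common(1)[0][0]

    def learn(self, x, y):
        # Incremental update using a single labelled example.
        self.counts[y] += 1


def prequential_accuracy(model, stream):
    """Test-then-train evaluation over an iterable of (features, label) pairs."""
    correct = 0
    total = 0
    for x, y in stream:
        # Test first: the model has not yet seen this example.
        if model.predict(x) == y:
            correct += 1
        # Then train on the same example.
        model.learn(x, y)
        total += 1
    return correct / total if total else 0.0


if __name__ == "__main__":
    stream = [(0, "a"), (1, "a"), (2, "b"), (3, "a")]
    print(prequential_accuracy(MajorityClassLearner(), stream))
```

Because every example is scored before it is learned from, the measure reflects how the model performs on unseen data as the stream evolves, which is why it suits evaluation under concept drift.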
