CEC Theses and Dissertations

Date of Award

2016

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Information Systems (DISS)

Department

College of Engineering and Computing

Advisor

Sumitra Mukherjee

Committee Member

Michael J. Lazlo

Committee Member

Gregory E. Simco

Abstract

Operational deployment of machine learning based classifiers in real-world networks has become an important area of research to support automated real-time quality of service decisions by Internet service providers (ISPs) and more generally, network administrators. As the Internet has evolved, multimedia applications, such as voice over Internet protocol (VoIP), gaming, and video streaming, have become commonplace. These traffic types are sensitive to network perturbations, e.g. jitter and delay. Automated quality of service (QoS) capabilities offer a degree of relief by prioritizing network traffic without human intervention; however, they rely on the integration of real-time traffic classification to identify applications. Accordingly, researchers have begun to explore various techniques to incorporate into real-world networks. One method that shows promise is the use of machine learning techniques trained on sub-flows – a small number of consecutive packets selected from different phases of the full application flow. Generally, research on machine learning classifiers was based on statistics derived from full traffic flows, which can limit their effectiveness (recall and precision) if partial data captures are encountered by the classifier. In real-world networks, partial data captures can be caused by unscheduled restarts/reboots of the classifier or data capture capabilities, network interruptions, or application errors. Research on the use of machine learning algorithms trained on sub-flows to classify VoIP and gaming traffic has shown promise, even when partial data captures are encountered. This research extends that work by applying machine learning algorithms trained on multiple sub-flows to classification of video streaming traffic.

Results from this research indicate that sub-flow classifiers have much higher and more consistent recall and precision than full flow classifiers when applied to video traffic. Moreover, the application of ensemble methods, specifically Bagging and adaptive boosting (AdaBoost) further improves recall and precision for sub-flow classifiers. Findings indicate sub-flow classifiers based on AdaBoost in combination with the C4.5 algorithm exhibited the best performance with the most consistent results for classification of video streaming traffic.

Share

COinS