CEC Theses and Dissertations

Date of Award

2016

Document Type

Dissertation

Degree Name

Doctor of Philosophy in Computer Science (CISD)

Department

College of Engineering and Computing

Advisor

Sumitra Mukherjee

Committee Member

Francisco J. Mitropoulos

Committee Member

Michael J. Laszlo

Abstract

Soft methods of artificial intelligence are often used to predict non-deterministic time series that cannot be modeled using standard econometric methods. Such series, which are common in finance, often undergo changes to their underlying data generation process. These changes result in inaccurate approximations or require additional human judgment and input, which hinders the potential for automated solutions.

Genetic programming (GP) is a class of nature-inspired algorithms that aims to evolve a population of computer programs to solve a target problem. GP has been applied to time series prediction in finance and other domains. However, most GP-based approaches to these prediction problems do not consider regime change.
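As a rough illustration of what evolving a population of programs means, the sketch below runs a deliberately simplified, mutation-only GP loop on a toy regression target. It is not the system described in this dissertation: the function set, the terminal set, the truncation selection scheme, and all names (`random_tree`, `mutate`, `evolve`) are assumptions chosen for brevity, and crossover is omitted.

```python
import operator
import random

# Function and terminal sets are assumptions made for this illustration.
FUNCTIONS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}
TERMINALS = ["x", 1.0, 2.0]


def random_tree(rng, depth=3):
    """Grow a random expression tree as nested tuples: (op, left, right)."""
    if depth <= 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(sorted(FUNCTIONS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def evaluate(tree, x):
    """Evaluate an expression tree at the input value x."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return FUNCTIONS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree


def mutate(tree, rng, depth=3):
    """Subtree mutation: replace one randomly chosen subtree with a new one."""
    if not isinstance(tree, tuple) or rng.random() < 0.2:
        return random_tree(rng, depth)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(left, rng, depth - 1), right)
    return (op, left, mutate(right, rng, depth - 1))


def fitness(tree, samples):
    """Sum of squared errors over (x, y) samples; lower is better."""
    return sum((evaluate(tree, x) - y) ** 2 for x, y in samples)


def evolve(samples, pop_size=50, generations=20, seed=0):
    """Keep the better half each generation and refill with mutated copies."""
    rng = random.Random(seed)
    population = [random_tree(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda t: fitness(t, samples))
        history.append(fitness(population[0], samples))
        survivors = population[: pop_size // 2]
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    best = min(population, key=lambda t: fitness(t, samples))
    history.append(fitness(best, samples))
    return best, history


# Toy target y = x^2 + 1; the data are made up for the example.
samples = [(float(x), float(x * x + 1)) for x in range(-3, 4)]
best, history = evolve(samples)
```

Because the better half of the population survives unchanged, the best error recorded in `history` never increases from one generation to the next. Practical GP systems add crossover, richer function sets, and controls on program size.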

This dissertation introduces two new genetic programming modularity techniques, collectively referred to as automatically defined templates, which better enable prediction of time series involving regime change. These methods, based on earlier established GP modularity techniques, take inspiration from software design patterns and are more closely modeled after the way humans actually develop software. Specifically, a regime detection branch is incorporated into the GP paradigm. Regime-specific behavior evolves in a separate program branch, implementing the template method pattern.
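The structure described above can be pictured with a small hand-written sketch of the template method pattern: a fixed skeleton calls a regime detection step and then delegates to a regime-specific step. This is only an analogy under stated assumptions. In the dissertation these branches are evolved by GP, whereas here they are ordinary functions, and the volatility rule, the threshold, the regime labels, and all names (`detect_regime`, `predict_calm`, `predict_volatile`, `predict`) are hypothetical.

```python
from statistics import pstdev

def detect_regime(window, threshold=1.0):
    """Hypothetical regime detection branch: label the window by the
    volatility of its one-step changes (an assumed rule, not the author's)."""
    changes = [b - a for a, b in zip(window, window[1:])]
    return "volatile" if pstdev(changes) > threshold else "calm"


def predict_calm(window):
    """Hypothetical calm-regime branch: extrapolate the last change."""
    return window[-1] + (window[-1] - window[-2])


def predict_volatile(window):
    """Hypothetical volatile-regime branch: fall back to the window mean."""
    return sum(window) / len(window)


# Regime-specific steps that the template can plug in.
REGIME_BRANCHES = {"calm": predict_calm, "volatile": predict_volatile}


def predict(window, branches=REGIME_BRANCHES, threshold=1.0):
    """Template method: the skeleton is fixed (detect, then delegate);
    only the regime-specific step varies."""
    regime = detect_regime(window, threshold)
    return branches[regime](window)


trend_forecast = predict([1.0, 2.0, 3.0, 4.0])   # steady changes: calm branch
noisy_forecast = predict([1.0, 5.0, 1.0, 5.0])   # large swings: volatile branch
```

Keeping the skeleton fixed while the branches vary is what lets regime-specific behavior change without touching the overall prediction procedure.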

A system was developed to test, validate, and compare the proposed approach with earlier approaches to GP modularity. Prediction experiments were performed on synthetic time series and on the S&P 500 index. The performance of the proposed approach was evaluated by comparing prediction accuracy with existing methods.

One of the two proposed techniques is shown to significantly improve the performance of time series prediction in series undergoing regime change. The second proposed technique did not show any improvement and performed generally worse than existing methods or the canonical approaches. The difference in relative performance was shown to be due to a decoupling of reusable modules from the evolving main program population. This observation also explains earlier results regarding the inferior performance of genetic programming techniques using a similar, decoupled approach. Applied to financial time series prediction, the proposed approach beat a buy-and-hold return on the S&P 500 index as well as the return achieved by other regime-aware genetic programming methodologies. No approach tested beat the benchmark return when factoring in transaction costs.
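The effect of transaction costs on such a comparison can be illustrated with a short calculation. The sketch below is not the evaluation procedure used in the dissertation: the proportional cost model, the long-or-flat position encoding, the 5% cost rate, and the price series are all assumptions invented for the example, and the function names are hypothetical.

```python
def buy_and_hold_return(prices):
    """Return from buying at the first price and holding to the last."""
    return prices[-1] / prices[0] - 1.0


def strategy_return(prices, positions, cost_rate=0.0):
    """Return of a long-or-flat strategy under an assumed cost model.

    positions[t] is 1 (long) or 0 (flat) for the interval from prices[t]
    to prices[t + 1]. Each change of position, including the initial
    entry, costs a fraction cost_rate of current wealth. A position still
    open at the end is not liquidated.
    """
    wealth = 1.0
    previous = 0
    for t in range(len(prices) - 1):
        if positions[t] != previous:
            wealth *= 1.0 - cost_rate   # pay for the trade
            previous = positions[t]
        if positions[t] == 1:
            wealth *= prices[t + 1] / prices[t]
    return wealth - 1.0


# Made-up prices: up 10%, down 10%, up 10%.
prices = [100.0, 110.0, 99.0, 108.9]
positions = [1, 0, 1]   # a timing rule that sidesteps the drop

benchmark = buy_and_hold_return(prices)
gross = strategy_return(prices, positions)
net = strategy_return(prices, positions, cost_rate=0.05)
```

In this made-up case the timing rule beats buy and hold before costs, but its three trades at the assumed rate leave it behind the benchmark afterwards, which is the kind of reversal that frequent trading can produce.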