CEC Theses and Dissertations

Date of Award


Document Type


Degree Name

Doctor of Philosophy in Computer Science (CISD)


College of Engineering and Computing


Francisco J. Mitropoulos

Committee Member

Sumitra Mukherjee

Committee Member

Renata Rand McFadden


In legacy software, non-functional concerns tend to cut across the system and manifest themselves as tangled or scattered code. If these crosscutting concerns could be modularized and the system refactored, then the system would become easier to understand, modify, and maintain. Modularized crosscutting concerns are known as aspects and the process of identifying aspect candidates in legacy software is called aspect mining.

One of the techniques used in aspect mining is clustering and there are many clustering algorithms. Current aspect mining clustering algorithms attempt to form clusters by optimizing one objective function. However, the objective function to be optimized tends to bias the formation of clusters towards the data model implicitly defined by that function. One solution is to use algorithms that try to optimize more than one objective function. These multiobjective algorithms have been used successfully in data mining but, as far as this author knows, have not been applied to aspect mining.

This study investigated the feasibility of using multiobjective evolutionary algorithms, in particular, multiobjective genetic algorithms, in aspect mining. The study utilized an existing multiobjective genetic algorithm, MOCK, which had already been tested against several popular single objective clustering algorithms. MOCK has been shown to be, on average, as good as, and sometimes better than, those algorithms. Since some of those data mining algorithms have counterparts in aspect mining, it was reasonable to assume that MOCK would perform at least as good in an aspect mining context.

Since MOCK's objective functions were not directly trying to optimize aspect mining metrics, the study also implemented another multiobjective genetic algorithm, AMMOC, based on MOCK but tailored to optimize those metrics. The reasoning hinged on the fact that, since the goal was to determine if a clustering method resulted in optimizing these quality metrics, it made sense to attempt to optimize these functions directly instead of a posteriori.

This study determined that these multiobjective algorithms performed at least as good as two popular aspect mining algorithms, k-means and hierarchical agglomerative. As a result, this study has contributed to both the theoretical body of knowledge in the field of aspect mining as well as provide a practical tool for the field.