CCE Theses and Dissertations
Campus Access Only
All rights reserved. This publication is intended for use solely by faculty, students, and staff of Nova Southeastern University. No part of this publication may be reproduced, distributed, or transmitted in any form or by any means, now known or later developed, including but not limited to photocopying, recording, or other electronic or mechanical methods, without the prior written permission of the author or the publisher.
Date of Award
2011
Document Type
Dissertation - NSU Access Only
Degree Name
Doctor of Philosophy in Computer Science (CISD)
Department
Graduate School of Computer and Information Sciences
Advisor
Junping Sun
Committee Member
Sumitra Mukherjee
Committee Member
Francisco Mitropoulos
Keywords
aspect, clustering, concern, crosscutting, mining
Abstract
A legacy software system can be taken to consist of N methods which contain within their implementations the intended activities and functions of the system. These activities and functions are referred to as concerns. Some of these concerns are typically implemented and used in multiple methods throughout the system and these are deemed to be crosscutting concerns. Through the use of an aspect-oriented programming paradigm, the implementation and use of these crosscutting concerns can be abstracted into aspects. In order to refactor the system, the process of aspect mining is carried out to identify the crosscutting concerns in the software system. Once identified, the crosscutting concerns can then be refactored into aspects.
Clustering-based aspect mining techniques make use of a vector space model to represent the source code to be mined. In this investigation, the individual methods of the software system were represented by a d-dimensional vector by mapping a method M to the vector V where the components of the vector V were values derived from applying a source code metric to each method M. These vector space models were then processed through the k-means++ clustering algorithm and the resulting cluster configurations were then evaluated to assess the quality of the results with respect to the identification of crosscutting concerns.
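As a rough illustration of the mapping and clustering step described above, the following sketch represents each method as a small metric vector and groups the vectors with k-means++ seeding followed by standard k-means iterations. The method names, the two metrics (fan-in and statement count), and all numeric values are hypothetical; the abstract does not state which metrics or parameters the dissertation used, and this is not the author's implementation.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeanspp_seeds(points, k, rng):
    """k-means++ seeding: first center uniform at random, each later center
    drawn with probability proportional to its squared distance from the
    nearest center chosen so far. Assumes at least k distinct points."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers

def kmeans(points, k, seed=0, max_iter=100):
    """Lloyd's iterations started from k-means++ seeds."""
    rng = random.Random(seed)
    centers = kmeanspp_seeds(points, k, rng)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:      # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:               # keep the old center if a cluster empties
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers

# Hypothetical vector space model: method name -> (fan_in, statement_count).
method_vectors = {
    "log":   (9.0, 1.0), "trace":  (8.0, 2.0), "audit": (9.0, 2.0),
    "parse": (1.0, 8.0), "render": (2.0, 9.0), "save":  (1.0, 9.0),
}
names = list(method_vectors)
labels, centers = kmeans([method_vectors[n] for n in names], k=2)
for name, label in zip(names, labels):
    print(name, "-> cluster", label)
```

In this toy data the high fan-in methods end up in one cluster and the remaining methods in the other; a cluster of small, widely called methods is the kind of group that would then be inspected as a candidate crosscutting concern.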
This research studied the effect that the number of dimensions of a vector space model has on the results of a clustering-based aspect mining algorithm. Several vector space models were defined and principal component analysis was used to reduce the dimensionality of the models. Each of the models was processed multiple times through the aspect mining algorithm and the distributions of the collected measures were tested for statistically significant differences using the Wilcoxon rank sum test. The results indicate that changes in the number of dimensions of a vector space model can produce significant effects in the collected measures. In addition, the measures used to assess the performance of an aspect mining process need to be analyzed for underlying relationships.
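The statistical comparison mentioned above can be sketched as follows: a two-sided Wilcoxon rank sum test on one quality measure collected over repeated runs of two models. The sample values are made up for illustration, and the use of the normal approximation without tie or continuity corrections is an assumption of this sketch; the abstract does not say how the test was computed in the dissertation.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1   # positions i..j are 0-based
        for t in range(i, j + 1):
            ranks[order[t]] = mean_rank
        i = j + 1
    return ranks

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test via the normal approximation.
    Returns (W, z, p), where W is the rank sum of sample x."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))   # equals 2 * (1 - Phi(|z|))
    return w, z, p

# Hypothetical measure values from five runs each of two models, for example
# a full-dimensional model and a PCA-reduced one.
full_model = [0.61, 0.64, 0.66, 0.70, 0.72]
reduced_model = [0.41, 0.45, 0.48, 0.52, 0.55]

w, z, p = rank_sum_test(full_model, reduced_model)
print(f"W = {w}, z = {z:.3f}, p = {p:.4f}")
```

With samples this small the normal approximation is coarse, so an exact test would be preferable in practice; the sketch is only meant to show how two distributions of a collected measure are compared by rank rather than by mean.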
NSUWorks Citation
William Tribbey. 2011. Construction and Analysis of Vector Space Models for Use in Aspect Mining. Doctoral dissertation. Nova Southeastern University. Retrieved from NSUWorks, Graduate School of Computer and Information Sciences. (326)
https://nsuworks.nova.edu/gscis_etd/326