CCE Theses and Dissertations
Campus Access Only
All rights reserved. This publication is intended for use solely by faculty, students, and staff of Nova Southeastern University. No part of this publication may be reproduced, distributed, or transmitted in any form or by any means, now known or later developed, including but not limited to photocopying, recording, or other electronic or mechanical methods, without the prior written permission of the author or the publisher.
Date of Award
2013
Document Type
Dissertation - NSU Access Only
Degree Name
Doctor of Philosophy in Computer Information Systems (DCIS)
Department
Graduate School of Computer and Information Sciences
Advisor
Sumitra Mukherjee
Committee Member
Michael Laszlo
Committee Member
Francisco Mitropoulos
Keywords
constrained forest problem, genetic algorithm, NP-hard
Abstract
Given an undirected edge-weighted graph G and a positive integer m, the Constrained Forest Problem (CFP) seeks a lowest-cost (or minimum-weight) forest F which spans G while satisfying the requirement that each tree in F contain at least m vertices. This problem has been shown to be NP-hard for values of m greater than three, giving rise to a number of approximation strategies for finding reasonable m-forest solutions. This research presents a new genetic algorithm (GA) which can consistently find equal-or-better solutions to the problem when compared to non-genetic alternatives. This GA is unique in that it uses chromosomes which are actual candidate solutions (m-forests) and performs genetic operations (random creation, selection, recombination, and mutation) on these candidate solutions. Experiments were run using 180 different GA configurations on 50 benchmark graphs to determine which operators and techniques would be most successful in solving the m-forest problem. The "heaviest edge first" or HEF algorithm run against the minimum spanning tree (MST) of a graph was used as a performance metric. Previously, the HEF(MST) algorithm had been shown to produce the best results on m-forest problems. When the GA was able to find better results than HEF(MST) on the same problem instance, this was considered a GA success. Since the GA's initial population included heuristic candidate solutions such as HEF(MST), the GA never did worse than the best of these. GA solution quality did vary, however, often finding several different better-than-HEF(MST) solutions, illustrating the stochastic nature of the process. Based on data collected from the 9000 initial problem instances, several factors were shown to significantly improve the quality of the GA solution. Problem instances which did not include mutation had a much lower success rate than those which did. Adding calculated heuristic solutions such as HEF(MST) to the initial population allowed the GA to converge more quickly and improved its likelihood of finding better-than-HEF(MST) solutions. Building an initial population using randomly-generated candidate solutions whose edges were restricted to the problem graph's MST proved equally successful. GA configuration options were analyzed using all 9000 test cases and again using only those 403 cases in which the GA was able to find the very best solution for each graph. These analyses were consistent, and resulted in the identification of a single "best" GA configuration which combined the best overall initial population strategy, random seeding algorithms, mutation and crossover strategy. The selected configuration was then further tested using various values of m to ensure that the resulting GA could in fact find better-than-HEF(MST) solutions for the majority of problem instances.
NSUWorks Citation
John John Queern. 2013. An evolutionary algorithm for the constrained forest problem. Doctoral dissertation. Nova Southeastern University. Retrieved from NSUWorks, Graduate School of Computer and Information Sciences. (278)
https://nsuworks.nova.edu/gscis_etd/278.