Campus Access Only
All rights reserved. This publication is intended for use solely by faculty, students, and staff of Nova Southeastern University. No part of this publication may be reproduced, distributed, or transmitted in any form or by any means, now known or later developed, including but not limited to photocopying, recording, or other electronic or mechanical methods, without the prior written permission of the author or the publisher.
Doctor of Philosophy (PhD)
College of Computing and Engineering
Artificial General Intelligence, Convolutional Neural Networks, Data Augmentation, Generative Adversarial Networks, Spatial Invariance, Stochastic Augmentation
Convolutional Neural Networks (CNNs) have achieved impressive results on complex visual tasks such as image recognition. They are commonly assumed to be spatially invariant to small transformations of their input images. Spatial invariance is a fundamental property that characterizes how a model reacts to input transformations, and therefore how well it generalizes; deep networks that robustly classify objects placed in different orientations or lighting conditions are said to have this property. However, several authors have recently shown that the assumption does not hold: slight rotations, translations, or rescaling of the input images significantly reduce a network's predictive accuracy. Furthermore, incorrectly classified images can have disastrous consequences, such as fatalities involving self-driving cars, and can raise concerns about racial discrimination.
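One simple way to make the notion of invariance concrete is to count how often a model's predicted label survives a transformation of its input. The sketch below is illustrative only: the function names, the two toy classifiers, and the 3x3 images are hypothetical stand-ins and are not the measure, models, or data used in the dissertation.

```python
def shift_right(img, k=1):
    """Shift every row right by k pixels, padding with zeros."""
    width = len(img[0])
    k = min(k, width)
    return [[0] * k + row[:width - k] for row in img]


def prediction_consistency(classify, images, transform):
    """Fraction of images whose predicted label is unchanged by `transform`."""
    if not images:
        raise ValueError("images must be non-empty")
    same = sum(classify(img) == classify(transform(img)) for img in images)
    return same / len(images)


# Two hypothetical toy classifiers standing in for a trained CNN.
def total_brightness_classifier(img):
    """Label depends only on the pixel sum, which this shift preserves here."""
    return int(sum(map(sum, img)) > 2)


def first_column_classifier(img):
    """Label depends on a fixed position, so it is sensitive to shifts."""
    return int(sum(row[0] for row in img) > 0)


images = [
    [[1, 0, 0], [1, 0, 0], [1, 0, 0]],  # vertical bar in column 0
    [[0, 1, 0], [0, 1, 0], [0, 1, 0]],  # vertical bar in column 1
]

invariant_score = prediction_consistency(total_brightness_classifier, images, shift_right)
sensitive_score = prediction_consistency(first_column_classifier, images, shift_right)
```

A score of 1.0 means every prediction was unchanged by the shift; lower values indicate predictions that depend on where the object sits in the frame.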
Data augmentation is a mainstream technique for improving invariance in CNNs: it artificially enlarges the training set by generating new and diverse data points derived from the existing dataset. This research aimed to provide a rigorous comparative analysis of two novel data augmentation techniques used to improve spatial invariance and reduce overfitting in CNNs: RandAugment (a stochastic technique that applies geometric transforms to images) and Conditional Generative Adversarial Networks (generative models that create synthetic samples).
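The stochastic idea behind RandAugment can be sketched in a few lines: for each image, pick a fixed number of operations uniformly at random from a pool and apply each at a shared magnitude. The version below is a simplified illustration in plain Python, assuming a pool of three geometric operations on list-of-lists images; the operation pool, the magnitude scale, and all function names are assumptions for this sketch and do not reproduce the configuration used in the dissertation.

```python
import random


def translate_x(img, magnitude):
    """Shift every row right by `magnitude` pixels, padding with zeros."""
    width = len(img[0])
    k = min(magnitude, width)
    return [[0] * k + row[:width - k] for row in img]


def translate_y(img, magnitude):
    """Shift the image down by `magnitude` pixels, padding with zero rows."""
    height, width = len(img), len(img[0])
    k = min(magnitude, height)
    return [[0] * width for _ in range(k)] + [list(r) for r in img[:height - k]]


def rotate90(img, magnitude):
    """Rotate clockwise by 90 degrees, `magnitude` times."""
    out = [list(r) for r in img]
    for _ in range(magnitude % 4):
        out = [list(col) for col in zip(*out[::-1])]
    return out


# Hypothetical operation pool; the published method also includes colour operations.
OPS = [translate_x, translate_y, rotate90]


def rand_augment(img, num_ops=2, magnitude=1, rng=None):
    """Apply `num_ops` randomly chosen operations, each at the same magnitude."""
    rng = rng or random.Random()
    out = [list(r) for r in img]  # copy so the input image is not mutated
    for op in rng.choices(OPS, k=num_ops):
        out = op(out, magnitude)
    return out


sample = [[1, 2], [3, 4]]
augmented = rand_augment(sample, num_ops=2, magnitude=1, rng=random.Random(0))
```

Passing a seeded `random.Random` makes a run repeatable, which helps when comparing augmentation policies on equal footing.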
This work fine-tuned pre-trained ResNet50 and InceptionV3 networks under each augmentation method. It evaluated and compared these networks and a base model (NoelNet) developed for this work on three benchmark image data sets taken from different perspectives (MNIST, FMNIST, and CIFAR-10), using the following metrics: training/validation loss, training/validation accuracy, test accuracy, precision, recall, and training latency. The experimental setup for each analysis was guided by five policies: Policy 1 (no data augmentation), Policy 2 (augmentation using RandAugment), Policy 3 (augmentation using GAN), Policy 4 (GAN-generated samples combined with non-augmented samples), and Policy 5 (RandAugment applied to GAN-generated samples).
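The five policies can be read as five ways of assembling a training set from real samples, GAN-generated samples, and an augmentation function. The sketch below shows one possible reading; `build_training_set` and its arguments are hypothetical names. It assumes that Policy 2 replaces each real sample with its augmented version and that Policy 3 trains on GAN-generated samples alone, neither of which is spelled out in the abstract.

```python
def build_training_set(policy, real, synthetic, augment):
    """Assemble a training set for one of the five policies.

    real      -- non-augmented samples from the benchmark data set
    synthetic -- samples produced by a (conditional) GAN
    augment   -- function applying a RandAugment-style transform to one sample
    """
    if policy == 1:  # no data augmentation
        return list(real)
    if policy == 2:  # RandAugment on real samples (assumed to replace originals)
        return [augment(x) for x in real]
    if policy == 3:  # GAN augmentation (assumed: synthetic samples only)
        return list(synthetic)
    if policy == 4:  # GAN-generated samples combined with non-augmented samples
        return list(real) + list(synthetic)
    if policy == 5:  # RandAugment applied to GAN-generated samples
        return [augment(x) for x in synthetic]
    raise ValueError(f"unknown policy: {policy}")


# Toy usage with strings standing in for images.
real = ["r1", "r2"]
synthetic = ["g1"]
mark = lambda x: x + "*"  # placeholder for an augmentation function

sets = {p: build_training_set(p, real, synthetic, mark) for p in range(1, 6)}
```

Keeping the policy logic in a single function means the same training and evaluation loop can be reused for every network and data set, with only the policy number changing.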
David Noel. 2023. An Investigation of Methods for Improving Spatial Invariance of Convolutional Neural Networks for Image Classification. Doctoral dissertation. Nova Southeastern University. Retrieved from NSUWorks, College of Computing and Engineering. (1185)