CEC Theses and Dissertations

Campus Access Only

All rights reserved. This publication is intended for use solely by faculty, students, and staff of Nova Southeastern University. No part of this publication may be reproduced, distributed, or transmitted in any form or by any means, now known or later developed, including but not limited to photocopying, recording, or other electronic or mechanical methods, without the prior written permission of the author or the publisher.

Date of Award

2010

Document Type

Dissertation - NSU Access Only

Degree Name

Doctor of Philosophy in Computer Information Systems (DCIS)

Department

Graduate School of Computer and Information Sciences

Advisor

Wei Li

Committee Member

Michael J. Lazlo

Committee Member

Wei Li

Committee Member

Leo Irakliotis

Abstract

The research investigated whether a Latent Semantic Analysis (LSA)-based approach to image retrieval can map pixel intensity into a smaller concept space with good accuracy and reasonable computational cost. From a large set of computed tomography (CT) images, a retrieval query found all images for a particular patient based on semantic similarity. The effectiveness of the LSA retrieval was evaluated based on precision, recall, and F-score.
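The three evaluation measures named above can be sketched for a single query as follows. This is a minimal illustration using the standard definitions of precision, recall, and the balanced F-score. The function name `precision_recall_f` and the image identifiers are hypothetical and are not taken from the dissertation.

```python
def precision_recall_f(retrieved, relevant):
    """Return (precision, recall, F-score) for one retrieval query.

    retrieved: identifiers returned by the query (hypothetical names).
    relevant:  identifiers that truly belong to the target patient.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # correctly retrieved images
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    # Balanced F-score: harmonic mean of precision and recall.
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Example: every returned image is correct, but one relevant image is missed.
p, r, f = precision_recall_f(
    retrieved=["img1", "img2", "img3"],
    relevant=["img1", "img2", "img3", "img4"],
)
```

In this example precision is 1.0 and recall is 0.75, which mirrors the pattern in which a query returns only correct images but does not find all of them.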

This work extended the application of LSA to high-resolution CT radiology images. The images were chosen for their unique characteristics and their importance in medicine. Because CT images are intensity-only, they carry less information than color images. They typically have greater noise, higher intensity, greater contrast, and fewer colors than a raw RGB image. The study used intensity level as the basis for image feature extraction.

The focus of this work was a formal evaluation of the LSA method in the context of a large number of high-resolution radiology images. The study reported on preprocessing and retrieval time and discussed how reducing the feature set size affected the results. LSA is an information retrieval technique based on the vector-space model. It works by reducing the dimensionality of the vector space, bringing similar terms and documents closer together. Matlab software was used to report on retrieval and preprocessing time.
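The dimensionality reduction described above is commonly implemented with a truncated singular value decomposition (SVD). The sketch below shows that general idea only. It assumes a feature-by-image matrix of intensity values, uses Python with NumPy rather than the Matlab used in the study, and all function names and the toy data are hypothetical rather than the author's implementation.

```python
import numpy as np


def lsa_fit(A, k):
    """Reduce a (features x images) matrix A to a k-dimensional concept space."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k = U[:, :k]                 # basis of the concept space
    docs_k = Vt[:k, :].T * s[:k]   # one row per image, equal to U_k^T A transposed
    return U_k, docs_k


def lsa_project(q, U_k):
    """Project a query feature vector q into the concept space (U_k^T q)."""
    return q @ U_k


def cosine_rank(q_k, docs_k):
    """Rank images by cosine similarity to the projected query."""
    norms = np.linalg.norm(docs_k, axis=1) * np.linalg.norm(q_k)
    sims = docs_k @ q_k / np.where(norms == 0, 1.0, norms)
    return np.argsort(-sims), sims


# Toy data (assumed): 6 intensity features, 4 images.
# Columns 0 and 1 resemble each other, as do columns 2 and 3.
A = np.array([
    [9, 8, 1, 0],
    [8, 9, 0, 1],
    [1, 0, 9, 8],
    [0, 1, 8, 9],
    [1, 0, 7, 9],
    [0, 1, 9, 7],
], dtype=float)

U_k, docs_k = lsa_fit(A, k=2)
order, sims = cosine_rank(lsa_project(A[:, 0], U_k), docs_k)
```

Here `order` lists the image indices from most to least similar to the query. The choice of `k` trades retrieval quality against computational cost, which is the trade-off the study evaluates.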

In determining the minimum size of the concept space, it was found that the best combination of precision, recall, and F-score was achieved with 250 concepts (k = 250). With k = 250, this research reported precision of 100% on 100% of the queries and recall close to 90% on 100% of the queries. Selecting a higher number of concepts did not improve recall and resulted in significantly increased computational cost.
