CCE Theses and Dissertations

Utilization of Semantic Linking and Visualization Techniques to Facilitate Knowledge Creation from Textual Software Artifacts

Date of Award


Document Type


Degree Name

Doctor of Philosophy in Computing Technology in Education (DCTE)


Graduate School of Computer and Information Sciences


Francisco J. Mitropoulos

Committee Member

Gregory E. Simco

Committee Member

Junping Sun


In this dissertation, the researcher investigated how to promote knowledge creation and sharing among software team members by automatically linking existing textual knowledge artifacts via a centralized framework based on information retrieval. Software maintenance and support are important day-to-day tasks in IT departments, and they cannot be performed properly without a complete understanding of the system layout.

During the project life cycle, different professionals might require different types of project knowledge. For instance, system management might be interested in overall system knowledge, while component developers are more likely to be interested in detailed, low-level system knowledge. Sharing information without taking these levels of abstraction into consideration could therefore be overwhelming.

Because overall knowledge about a software project is often distributed among different information sources, such as source code, documentation, and manuals, the researcher concluded that information is most often conveyed through the medium of text. Based on this conclusion, textual artifacts were the primary subjects of study in this research. The most important of them was the software source code because, owing to programming language syntax, it contains important structural and semantic information.

In this dissertation, the researcher concentrated on a centralized knowledge creation and sharing framework based on specialized information filtering techniques and information retrieval strategies. The framework helped to link existing textual artifacts and to facilitate knowledge discovery and sharing among team members. To construct semantic similarity links between different textual artifacts, the researcher used latent semantic indexing (LSI), a specialized information retrieval technique based on the vector space model and linear algebra. LSI provided a much cheaper and more flexible way to automate the identification of semantic links between textual artifacts based on their linguistic similarities.
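The general idea behind LSI-based linking can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names, the sample artifacts, the raw-count weighting, the number of latent dimensions `k`, and the similarity threshold are all assumptions chosen for illustration. The sketch builds a term-document matrix, reduces it with a truncated singular value decomposition, and compares artifacts by cosine similarity in the reduced space.

```python
import re
import numpy as np

def tokenize(text):
    # Lowercase and split on non-letters, so open_connection -> open, connection.
    # A fuller pipeline would likely also remove stop words and apply stemming.
    return re.findall(r"[a-z]+", text.lower())

def lsi_similarity(docs, k=2):
    """Cosine similarity between documents in a k-dimensional latent space."""
    vocab = sorted({t for d in docs for t in tokenize(d)})
    index = {t: i for i, t in enumerate(vocab)}
    # Term-document matrix of raw term counts (rows: terms, columns: documents).
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for t in tokenize(d):
            A[index[t], j] += 1.0
    # Truncated SVD: keep only the k largest singular values.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))
    # Each row of D holds one document's coordinates in the latent space.
    D = (s[:k, None] * Vt[:k, :]).T
    norms = np.linalg.norm(D, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty documents
    D = D / norms
    return D @ D.T

def related(sim, i, threshold=0.7):
    """Indices and scores of artifacts similar to artifact i, best first."""
    order = np.argsort(-sim[i])
    return [(int(j), float(sim[i, j])) for j in order
            if j != i and sim[i, j] >= threshold]

# Hypothetical artifacts: two code fragments and two documentation sentences.
artifacts = [
    "def open_connection(host, port): open a database connection",
    "The database connection is opened with a host and a port.",
    "Render the report chart with axis labels and a legend.",
    "def render_chart(report): draw chart axis labels and legend",
]
sim = lsi_similarity(artifacts, k=2)
links = related(sim, 0, threshold=0.7)
```

Here `threshold` plays the role of a sensitivity setting: raising it yields fewer but closer links, while lowering it surfaces more loosely related artifacts.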

The objectives of this research were met by developing a framework, and an experimental application based on it, that linked existing software artifacts and presented users with a concise list of the materials most relevant to the desired topic. The framework employed different pre-processing techniques and visual forms to discover artifacts and their parts and to present them to users according to their semantic closeness and the desired sensitivity. The presented solution helped to recreate an indirect socialization stage and to promote explicit and tacit knowledge generation and transfer.

This document is currently not available here.
