CCE Theses and Dissertations

Date of Award

2023

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

College of Computing and Engineering

Advisor

Francisco Mitropoulos

Committee Member

Michael Laszlo

Committee Member

Sumitra Mukherjee

Keywords

Code Completion, Machine Learning, Natural Language Processing, Neural Networks, Python Modules, Source Code Analysis

Abstract

Contemporary software development with modern programming languages relies on integrated development environments, smart text editors, and similar tools whose code completion capabilities make software developers more efficient. Recent code completion research has shown that combining natural language processing with long short-term memory (LSTM) recurrent neural networks can improve the accuracy of code completion predictions over prior models. It is well known that the accuracy of predictive systems based on training data is correlated with the quality and quantity of that data. This dissertation demonstrates that expanding the training data set to include more references to specific Python third-party modules increases the quality of predictions for those modules without degrading the quality of predictions for the originally represented modules.
