CCE Theses and Dissertations

A Data Mining Framework for Improving Student Outcomes on Step 1 of the United States Medical Licensing Examination

James Clark, Nova Southeastern University

Date of Award

2019

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

College of Engineering and Computing

Advisor

Steven Terrell

Committee Member

Martha Snyder

Committee Member

Ling Wang

Keywords

data mining, machine learning, prediction model, Step 1, USMLE

Abstract

Identifying the factors associated with medical students who fail Step 1 of the United States Medical Licensing Examination (USMLE) has been a focus of investigation for many years. Some researchers believe lower scores on the Medical College Admission Test (MCAT) are the sole factor needed to identify students likely to fail. Other researchers believe lower course outcomes during the first two years of medical training are better indicators of failure. Yet some medical students enter medical school with high MCAT scores and fail Step 1, and, conversely, some students with lower academic credentials who are expected to have difficulty pass on the first attempt. Researchers have attempted to find the factors associated with Step 1 outcomes; however, the methods used have two problems. The first is the small sample of failing students that results from the high national pass rate on Step 1. The second is that research using multivariate regression models identifies correlates of Step 1 performance but does not predict individual student performance.

This study used data mining methods to create models that predict which medical students are at risk of failing Step 1 of the USMLE. Predictor variables include those available to admissions committees at application time and final grades in courses taken during the preclinical years of medical education. Models were trained, tested, and validated using a stepwise approach, adding predictor variables in the order in which courses are taken, to identify the point in the medical education continuum that best predicts which students will fail Step 1. Oversampling techniques were employed to address the problem of small sample sizes. The results suggest that at-risk medical students can be identified as early as the end of the first term of the first year. The approach used in this study can serve as a framework that, if implemented at other U.S. allopathic medical schools, can identify students in time for appropriate interventions to affect Step 1 outcomes.
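As a rough illustration of the two ideas named in the abstract, the sketch below shows one simple form of oversampling (random duplication of minority-class records) and a way to build cumulative predictor sets in curriculum order. The abstract does not say which oversampling technique or which variables the study used, so random oversampling, the function names, the stage names, the column names, and the toy data are all assumptions made for illustration only.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class. Originals are kept first."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def cumulative_feature_sets(stages):
    """Yield (stage name, all predictor names available by that stage)."""
    seen = []
    for name, columns in stages:
        seen = seen + columns
        yield name, list(seen)


# Hypothetical toy data: (MCAT score, first-term grade); label 1 = failed Step 1.
rows = [(510, 88), (505, 91), (512, 79), (498, 85), (501, 70), (495, 64)]
labels = [0, 0, 0, 0, 1, 1]
balanced_rows, balanced_labels = random_oversample(rows, labels)

# Hypothetical stages, listed in the order the data would become available.
stages = [
    ("admissions", ["mcat", "undergrad_gpa"]),
    ("year1_term1", ["anatomy", "biochemistry"]),
    ("year1_term2", ["physiology"]),
]
feature_sets = list(cumulative_feature_sets(stages))
```

In a setup like this, a model would be fit once per entry in `feature_sets`, and the earliest stage with acceptable performance would mark the point at which at-risk students could be flagged. Oversampling is normally applied to the training split only, so that duplicated records do not leak into the test or validation data.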

NSUWorks Citation

James Clark. 2019. A Data Mining Framework for Improving Student Outcomes on Step 1 of the United States Medical Licensing Examination. Doctoral dissertation. Nova Southeastern University. Retrieved from NSUWorks, College of Engineering and Computing. (1070)
https://nsuworks.nova.edu/gscis_etd/1070.
