Agreement of ChatGPT with Clinical Practice Guidelines for Knee Osteoarthritis Treatment

Abstract

Purpose: This cross-sectional study evaluated whether two ChatGPT (Chat Generative Pre-Trained Transformer) models, GPT-3.5 and GPT-4o, provide treatment information consistent with high-quality clinical practice guidelines (CPGs) for knee osteoarthritis (OA).

Methods: High-quality CPGs published in the past decade were identified via PubMed and PEDro, with the search updated on November 11, 2024. Guidelines were appraised using the AGREE II tool. GPT-3.5 and GPT-4o were queried with common treatment-related questions, and their responses were compared with CPG recommendations. Two independent reviewers conducted a thematic content analysis of the responses using a deductive–inductive codebook, iteratively refined through consensus, to identify major themes and subthemes and their frequencies. Consistency between ChatGPT outputs and CPGs was categorized as high agreement (4/4 “yes” responses), moderate agreement (3/4 “yes” responses), or low agreement (2/4 or fewer “yes” responses, or 2/4 not reported). Study selection and data extraction were performed independently by two reviewers, with a third reviewer consulted when necessary. Inter-reviewer agreement on the data extraction used to assess alignment between ChatGPT and CPGs was measured as percentage agreement in data identification and categorization.

Results: Four high-quality guidelines were identified and analyzed. GPT-3.5 and GPT-4o generated 14 and 10 questions, respectively, yielding 10 themes and 33 subthemes. The 10 themes were exercise and physical therapy, lifestyle modifications, medications, supplements and herbal remedies, assistive devices, additional therapies, education, consultation, intra-articular injections, and surgical and advanced treatments. Among the subthemes, 6.06% demonstrated high agreement (e.g., low-impact aerobic exercises, patient education), 18.18% moderate agreement (e.g., strengthening, weight management), and 75.76% low agreement. Excluding the physical therapy-specific CPG increased agreement for treatments such as analgesics, NSAIDs, glucosamine and chondroitin, and intra-articular corticosteroid injections.

Conclusion: While GPT-3.5 and GPT-4o demonstrated high or moderate agreement with CPGs for certain themes, most subthemes showed low agreement. A separate analysis excluding the CPG developed specifically for physical therapists modestly improved agreement levels but did not alter the overall pattern. These findings highlight the need for ongoing refinement of AI tools and underscore the importance of clinicians critically evaluating AI-generated content, particularly as patients may increasingly rely on such tools to guide self-management decisions.
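The agreement categories described in the Methods can be illustrated with a minimal Python sketch. This is not the authors' analysis code. The function names `agreement_level` and `percent_of_subthemes` are hypothetical, and the subtheme counts (2, 6, and 25 of 33) are inferred from the reported percentages rather than stated in the abstract.

```python
def agreement_level(yes_count: int, n_guidelines: int = 4) -> str:
    """Map the number of guidelines answering "yes" to an agreement category.

    Assumed reading of the abstract: 4/4 "yes" is high, 3/4 is moderate,
    and 2/4 or fewer is low. A subtheme with two or more guidelines not
    reporting can have at most 2 "yes" responses, so it also lands in "low".
    """
    if not 0 <= yes_count <= n_guidelines:
        raise ValueError("yes_count must be between 0 and n_guidelines")
    if yes_count == n_guidelines:
        return "high"
    if yes_count == n_guidelines - 1:
        return "moderate"
    return "low"


def percent_of_subthemes(count: int, total: int = 33) -> float:
    """Share of subthemes in a category, as a percentage rounded to 2 decimals."""
    return round(100 * count / total, 2)


# Hypothetical counts, inferred from the reported percentages (not from the source).
inferred_counts = {"high": 2, "moderate": 6, "low": 25}

for level, count in inferred_counts.items():
    print(f"{level}: {count}/33 = {percent_of_subthemes(count)}%")
```

Because the low category absorbs every subtheme with two or fewer "yes" responses, including those that half of the guidelines do not address, it is the broadest of the three categories under this reading.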

Author Bio(s)

Alessandra Trepte, PT, PhD, is an Assistant Professor in the Department of Physical Therapy at Campbell University. Her teaching focuses on research methodology, the International Classification of Functioning model, and social determinants of health. Her research focuses on musculoskeletal pain, health outcomes, and patient-reported measures.

Brian Neville, PT, DPT, PhD, is an Assistant Professor in the Department of Physical Therapy at Campbell University. He teaches biomechanics, exercise physiology, clinical reasoning, and pharmacology. His clinical and research interests include human performance, movement theory, precision rehabilitation, and regenerative medicine.

Brad Myers, PT, DPT, DSc, is Chair and Director of the Doctor of Physical Therapy Program at Campbell University. He specializes in orthopaedic manual therapy and clinical reasoning, with research interests in manual therapy, motor control, and exercise interventions for musculoskeletal dysfunction. He is a Fellow of AAOMPT and board certified in orthopaedics.

Lori Leineke, PT, DPT, EdD, is an Associate Professor in the Doctor of Physical Therapy Program at Campbell University. Board certified in orthopaedics, she brings over 25 years of clinical experience and serves as a reviewer for CAPTE and ABPTRFE. Her research focuses on sport performance and lower leg muscle activation.

Karlyn Green, PT, DPT, is an Assistant Professor in the Doctor of Physical Therapy Program at Campbell University. She is board certified in both cardiovascular/pulmonary and orthopaedic physical therapy. Her research interests focus on the intersection of cardiopulmonary and musculoskeletal care, including work on restrictive lung disease, COVID-19, and heart failure.

Recommended Citation

Garcia AN, Neville B, Myers B, Leineke L, Green K. Agreement of ChatGPT with Clinical Practice Guidelines for Knee Osteoarthritis Treatment. The Internet Journal of Allied Health Sciences and Practice. 2026 Mar 03;24(1), Article 16.

Included in

Education Commons, Health Information Technology Commons, Physical Therapy Commons
