Abstract
Objectives
Large language models, including ChatGPT, have the potential to transform the way we approach medical knowledge, yet accuracy in clinical topics is critical. Here we assessed ChatGPT’s performance in adhering to the clinical practice guidelines of the American Academy of Otolaryngology-Head and Neck Surgery.
Methods
We presented ChatGPT with 24 clinical otolaryngology questions based on the guidelines of the American Academy of Otolaryngology-Head and Neck Surgery. Each question was posed three times (N = 72 responses) to test the model’s consistency. Two otolaryngologists independently evaluated the responses for accuracy and relevance to the guidelines. Cohen’s kappa was used to measure inter-evaluator agreement, and Cronbach’s alpha assessed the consistency of ChatGPT’s responses across repetitions.
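The two statistics named above can be sketched in a few lines of Python. The rating data below are hypothetical (a 3-point accuracy scale is assumed purely for illustration) and do not reproduce the study’s data; the functions show only how such agreement and consistency measures are computed.

```python
# Illustrative sketch: Cohen's kappa (inter-rater agreement) and
# Cronbach's alpha (consistency across repeated tests).
# All data below are hypothetical, not the study's ratings.
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of marginal frequencies per category.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

def cronbach_alpha(items):
    """Cronbach's alpha; `items` is one score list per repeated test run."""
    k = len(items)
    def var(xs):  # sample variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    totals = [sum(col) for col in zip(*items)]  # per-question totals
    return k / (k - 1) * (1 - sum(var(it) for it in items) / var(totals))

# Hypothetical ratings on a 0-2 accuracy scale by two evaluators:
print(round(cohens_kappa([2, 2, 1, 0, 2, 1], [2, 2, 1, 1, 2, 1]), 2))  # → 0.71
# Hypothetical scores for three repeated runs over four questions:
print(round(cronbach_alpha([[2, 2, 1, 0], [2, 2, 1, 1], [2, 1, 1, 0]]), 2))  # → 0.92
```

In the study’s design the alpha calculation treats the three repetitions of each question as the "items", so a high alpha (here, 0.87) indicates that ChatGPT scored similarly each time a question was re-asked.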
Results
The study revealed mixed results: 59.7% (43/72) of ChatGPT’s responses were highly accurate, while only 2.8% (2/72) directly contradicted the guidelines. The model showed 100% accuracy in Head and Neck, but lower accuracy in Rhinology and Otology/Neurotology (66% each), Laryngology (50%), and Pediatrics (8%). Responses were consistent across the three repetitions for 17 of 24 questions (70.8%), with a Cronbach’s alpha of 0.87, indicating reasonable consistency across tests.
Conclusions
Using a guideline-based set of structured questions, ChatGPT demonstrates consistency but variable accuracy in otolaryngology. Its lower performance in some areas, especially Pediatrics, suggests that further rigorous evaluation is needed before considering real-world clinical use.
Data availability
Data supporting this study are included within the article and further data will be available upon reasonable request.
Funding
None.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
All authors declare no conflict of interest in connection with this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
Cite this article
Tessler, I., Wolfovitz, A., Alon, E.E. et al. ChatGPT’s adherence to otolaryngology clinical practice guidelines. Eur Arch Otorhinolaryngol 281, 3829–3834 (2024). https://doi.org/10.1007/s00405-024-08634-9