Mehmet Fatih Şahin, Anil Keleş, Rıdvan Özcan, Çağrı Doğan, Erdem Can Topkaç, Murat Akgül, Cenk Murat Yazıci, Evaluation of information accuracy and clarity: ChatGPT responses to the most frequently asked questions about premature ejaculation, Sexual Medicine, Volume 12, Issue 3, June 2024, qfae036, https://doi.org/10.1093/sexmed/qfae036
Abstract
Premature ejaculation (PE) is the most prevalent sexual dysfunction in men, and like many diseases and conditions, patients use Internet sources like ChatGPT, which is a popular artificial intelligence–based language model, for queries about this andrological disorder.
The objective of this research was to evaluate the quality, readability, and understanding of texts produced by ChatGPT in response to frequently requested inquiries on PE.
In this study we used Google Trends to identify the most frequently searched phrases related to PE. Subsequently, the identified keywords were methodically entered into ChatGPT, and the resulting replies were assessed for quality using the Ensuring Quality Information for Patients (EQIP) tool and the DISCERN instrument. The produced texts were assessed for readability using the Flesch–Kincaid Grade Level (FKGL) and Flesch Reading Ease Score (FRES) metrics.
This investigation has identified substantial concerns about the quality of texts produced by ChatGPT, highlighting severe problems with reading and understanding.
The mean EQIP score for the texts was determined to be 45.93 ± 4.34, while the FRES was 15.8 ± 8.73. Additionally, the FKGL score was computed to be 15.68 ± 1.67 and the DISCERN score was 38.1 ± 3.78. The comparatively low average EQIP and DISCERN scores suggest that improvements are required to increase the quality and dependability of the presented information. In addition, the FKGL scores indicate a significant degree of linguistic intricacy, requiring a level of knowledge comparable to about 14 to 15 years of formal schooling in order to understand. The texts about treatment, which are the most frequently searched items, are more difficult to understand compared to other texts about other categories.
The results of this research suggest that compared to texts on other topics the PE texts produced by ChatGPT exhibit a higher degree of complexity, which exceeds the recommended reading threshold for effective health communication. Currently, ChatGPT cannot be considered a substitute for comprehensive medical consultations.
This study is to our knowledge the first reported research investigating the quality and comprehensibility of information generated by ChatGPT in relation to frequently requested queries about PE. The main limitation is that the investigation included only the first 25 popular keywords in English.
ChatGPT is incapable of replacing the need for thorough medical consultations.
Introduction
Premature ejaculation (PE) is the predominant sexual disorder in male patients. Multiple studies have shown that the prevalence of PE in men ranges from 20% to 30%.1 Nevertheless, this disorder is often underreported due to the shame experienced by both patients and physicians, as well as a lack of knowledge, so PE may affect as many as 75% of males.2 There seem to be many underlying causes of PE, so a combination of different treatment approaches is needed to effectively address PE. Various therapeutic protocols have shown efficacy in prolonging the interval between the initiation of sexual activity and the occurrence of ejaculation. These therapy approaches include a wide range of interventions, ranging from behavioral adjustments and drugs to dietary changes and extensive surgical procedures.3
Nowadays, the Internet has emerged as a vital and indispensable reservoir of health-related knowledge. The accessibility of this information might greatly enhance the public’s medical knowledge and facilitate their rapid understanding of new advancements and treatments. However, its usefulness remains largely unrecognized.4 ChatGPT (OpenAI, United States) is a sophisticated Internet-based language model that uses deep learning techniques to produce replies that mimic patterns seen in human language.5 ChatGPT is currently one of the most extensive language models publicly available. It is able to grasp the nuances and intricacies of human language, consequently producing pertinent and contextually fitting replies for a diverse array of queries.6 ChatGPT serves as a great asset for medical students, physicians, nurses, and other health-care practitioners, enabling them to easily access the latest updates and breakthroughs in their specific areas of expertise.7 For patients, however, there are concerns that these models, like the Internet itself, might produce and spread medical errors.
In this study we sought to assess the quality and readability of replies given by ChatGPT to frequently asked queries about PE. Our goal was to answer this question: Are ChatGPT’s answers of sufficient quality and readability to inform patients about PE?
Materials and methods
This study, conducted on December 23, 2023, in the Tekirdag Namık Kemal University Urology Department, did not require any ethical committee approval because it contained only online information. For the same reason, there was no requirement for an IRB number or informed consent form. No patient data were used because this study is not a clinical study. Before the searches were performed, all personal browser data were deleted as a bias-preventative precaution. The Google Trends8 tool (https://trends.google.com/) (Google, United States) was used to determine the most frequently searched terms associated with PE.9 The search queries were gathered from worldwide searches beginning in 2004 and ending on December 23, 2023. The 25 most frequently sought terms were documented, covering a wide range of areas in Google’s online searches. Three keywords were excluded from the analysis because they were irrelevant to the topic or very short and incomplete: “erectile dysfunction,” “Viagra,” and “PE.” Subregions were used to categorize and document the geographic areas of interest.
The identified keywords were entered consecutively into the ChatGPT-4 version (https://chat.openai.com/), strictly according to the original search order. Prior to initiating these searches, all browser-related data were again completely deleted to prevent bias. In addition, a new account was created solely for engaging with ChatGPT. Each inquiry was performed on a separate chat page to keep the replies independent of one another and simplify the analysis. The replies obtained were stored for the readability and quality evaluations.
The quality of information on treatment alternatives for health conditions in each passage was evaluated using the validated DISCERN questionnaire, the primary purpose of which is to allow information providers and patients to assess the quality of written information regarding treatment options. The DISCERN questionnaire also aims to promote the creation of high-quality, evidence-based consumer health information by establishing standards and serving as a reference for authors.10 The instrument has 15 questions, each rated on a scale ranging from 1 to 5.
The Ensuring Quality Information for Patients (EQIP) tool was used to evaluate the quality of the acquired texts. This tool evaluates several aspects of the content, such as the coherence of information and the proficiency of the writing. The tool consists of 20 questions that may be answered either “yes,” “partly,” “no,” or “does not apply.” The score is calculated by multiplying the number of “yes” replies by 1, the number of “partly” replies by 0.5, and the number of “no” answers by zero. The obtained numbers are then added together, and the total is divided by the number of applicable items, that is, 20 minus the count of “does not apply” replies. Finally, the resulting value is multiplied by 100 to obtain the EQIP score, which is expressed as a percentage.11 For the analysis we used the final averaged EQIP score to assign each text to one of four classes. The categorization was determined by the score range and according to the suggestions specified in the original EQIP development document.11 Resources scoring between 76% and 100% were classified as “well written” and considered to be of excellent quality; those scoring between 51% and 75% were categorized as “good quality with minor issues,” those between 26% and 50% as “serious quality issues,” and those between 0% and 25% as having “severe quality problems.”12
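The scoring rule and score bands described above can be sketched in a few lines. This is a minimal illustration, assuming the conventional EQIP calculation in which items answered “does not apply” are removed from the denominator; the function names and the example answer counts are hypothetical and are not taken from the study.

```python
TOTAL_ITEMS = 20  # the EQIP tool has 20 questions


def eqip_score(yes: int, partly: int, no: int, not_applicable: int = 0) -> float:
    """Return the EQIP score as a percentage.

    Assumed rule: (yes * 1 + partly * 0.5 + no * 0) divided by the number
    of applicable items (20 minus "does not apply"), multiplied by 100.
    """
    if yes + partly + no + not_applicable != TOTAL_ITEMS:
        raise ValueError("answers must cover all 20 EQIP items")
    applicable = TOTAL_ITEMS - not_applicable
    if applicable == 0:
        raise ValueError("at least one item must be applicable")
    return (yes * 1.0 + partly * 0.5) / applicable * 100


def eqip_category(score: float) -> str:
    """Map a percentage score to the four bands quoted in the text.

    Band edges for fractional scores (e.g. 75.5) are an assumption here:
    anything above 75 is treated as the top band, and so on.
    """
    if score > 75:
        return "well written"
    if score > 50:
        return "good quality with minor issues"
    if score > 25:
        return "serious quality issues"
    return "severe quality problems"


# Hypothetical answer sheet: 7 "yes", 4 "partly", 8 "no", 1 "does not apply".
example = eqip_score(yes=7, partly=4, no=8, not_applicable=1)
print(round(example, 2), eqip_category(example))
```

With these hypothetical counts the numerator is 7 + 2 = 9 and the denominator is 19, so the text falls in the “serious quality issues” band.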
The EQIP tool has a wide range of applications outside of particular illnesses. It has been used to assess information sources in different domains and across various forms of information and is widely adaptable across several disciplines.13 Both the EQIP and DISCERN evaluations were carried out by M.F.Ş. and A.K.; Ç.D. was consulted in cases of disagreement. These raters have a scientific interest in PE and frequently examine this patient group. Because they were well versed in misleading information on the Internet, they were considered competent to assess the quality of ChatGPT’s replies. Kappa statistics were used to measure interrater reliability.
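To illustrate what a kappa statistic measures, the following is a minimal sketch of unweighted Cohen's kappa for two raters. The source does not state which kappa variant was used, so the choice of the unweighted form is an assumption, and the ratings shown are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty item set")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical answers of two raters to four EQIP items.
first = ["yes", "yes", "no", "no"]
second = ["yes", "no", "no", "no"]
print(cohens_kappa(first, second))
```

Here the raters agree on three of four items (0.75) while chance agreement is 0.5, so kappa is 0.5; values closer to 1 indicate agreement well beyond chance.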
The readability of the retrieved texts was assessed using the Flesch–Kincaid Grade Level (FKGL) test and Flesch Reading Ease score (FRES) metrics. The FKGL algorithm calculates the grade level needed to understand the material by considering criteria such as sentence length and syllable count. A lower grade-level score indicates easier understanding, while a higher score implies greater linguistic complexity.14 The formula for the FKGL is as follows:
$$\mathrm{FKGL} = 0.39 \times \frac{\text{total words}}{\text{total sentences}} + 11.8 \times \frac{\text{total syllables}}{\text{total words}} - 15.59$$
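The FRES score is higher for a more readable text and lower for a more challenging text;15 a score below 30 denotes a reading level appropriate for college graduates.16 The formula for the FRES test is as follows:
$$\mathrm{FRES} = 206.835 - 1.015 \times \frac{\text{total words}}{\text{total sentences}} - 84.6 \times \frac{\text{total syllables}}{\text{total words}}$$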
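Both readability scores depend only on three counts: words, sentences, and syllables. The sketch below applies the standard published Flesch–Kincaid and Flesch Reading Ease formulas to counts supplied by the caller; how the counts are obtained (in particular syllable counting) is left out, and the example counts are hypothetical rather than taken from the study.

```python
def _check_counts(words: int, sentences: int, syllables: int) -> None:
    if words <= 0 or sentences <= 0 or syllables <= 0:
        raise ValueError("word, sentence, and syllable counts must be positive")


def fkgl(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: higher means harder to read."""
    _check_counts(words, sentences, syllables)
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def fres(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease Score: higher means easier to read."""
    _check_counts(words, sentences, syllables)
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


# Hypothetical passage: 100 words, 5 sentences, 150 syllables.
print(round(fkgl(100, 5, 150), 2))  # grade level
print(round(fres(100, 5, 150), 3))  # reading ease
```

Longer sentences and more syllables per word push the FKGL up and the FRES down, which is why the two metrics move in opposite directions for the same text.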
The statistical analysis for this research was conducted using SPSS version 25 (IBM, New York, United States). Data normality was assessed using the Shapiro–Wilk test. Continuous data are presented as the mean and standard deviation, while categorical data are presented as frequencies. The Kruskal–Wallis test was used to compare the groups. The Spearman correlation coefficient was used to investigate any possible relationships. Statistical significance was set at P < .05, with a 95% CI.
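For readers unfamiliar with the Kruskal–Wallis test, the sketch below computes its H statistic for exactly three groups in plain Python. It is an illustration only and not the authors' SPSS procedure: the function name and the sample values are hypothetical, and the P value relies on the chi-square approximation, which for three groups (2 degrees of freedom) reduces to exp(-H/2).

```python
import math


def kruskal_wallis_3(group1, group2, group3):
    """Return (H, p) for three independent groups, with tie correction."""
    groups = [list(group1), list(group2), list(group3)]
    if any(len(g) == 0 for g in groups):
        raise ValueError("every group needs at least one observation")
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)

    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i                               # size of this tie block
        rank_of[pooled[i]] = (i + 1 + j) / 2    # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j

    rank_sum_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)

    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    h /= correction

    # Chi-square survival function with 2 degrees of freedom.
    p = math.exp(-h / 2)
    return h, p


# Hypothetical scores for three categories (not data from the study).
h_stat, p_value = kruskal_wallis_3([1, 2, 3], [4, 5, 6], [7, 8, 9])
print(round(h_stat, 2), round(p_value, 4))
```

Because the test works on ranks rather than raw values, it suits small groups whose scores cannot be assumed to be normally distributed.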
Results
The top three keywords were “premature ejaculation treatment,” “cure premature ejaculation,” and “stop premature ejaculation.” A total of three keywords were eliminated (Table 1).
Table 1. The 25 most frequently searched keywords worldwide for PE from 2004 to 2023, as determined by Google Trends data.
Rank | Keyword | Relevance | Classification of the topic according to EQIP |
---|---|---|---|
1 | Premature ejaculation treatment | 100 | Discharge or aftercare |
2 | Cure premature ejaculation | 84 | Discharge or aftercare |
3 | Stop premature ejaculation | 72 | Discharge or aftercare |
4 | Premature ejaculation medicine | 68 | Discharge or aftercare |
5 | What is premature ejaculation | 58 | Condition or illness |
6 | How to stop premature ejaculation | 50 | Discharge or aftercare |
7 | Premature ejaculation causes | 48 | Miscellaneous |
8 | Medicine for premature ejaculation | 46 | Discharge or aftercare |
9 | (excluded) | 41 | |
10 | Premature ejaculation pills | 39 | Discharge or aftercare |
11 | Prevent premature ejaculation | 39 | Discharge or aftercare |
12 | Premature ejaculation help | 38 | Miscellaneous |
13 | Premature ejaculation cause | 37 | Miscellaneous |
14 | Premature ejaculation meaning | 36 | Condition or illness |
15 | Ejaculation meaning | 35 | Condition or illness |
16 | (excluded) | 34 | |
17 | Treatment for premature ejaculation | 31 | Discharge or aftercare |
18 | How to cure premature ejaculation | 25 | Discharge or aftercare |
19 | Control premature ejaculation | 23 | Discharge or aftercare |
20 | How to prevent premature ejaculation | 23 | Discharge or aftercare |
21 | Premature ejaculation exercises | 21 | Discharge or aftercare |
22 | (excluded) | 21 | |
23 | Cure for premature ejaculation | 21 | Discharge or aftercare |
24 | Premature ejaculation spray | 20 | Discharge or aftercare |
25 | Causes of premature ejaculation | 20 | Miscellaneous |
Abbreviation: EQIP, Ensuring Quality Information for Patients.
Ghana, with a Search Interest Score (SIS) of 100, Somalia (SIS: 52), and Kenya (SIS: 51) ranked as the three nations with the most search interest in PE. Figure 1 displays the search interest in PE across various nations.
![The global search interest by region for premature ejaculation from 2004 to 2023, from Google trends data (low search volume regions included).](https://cdn.statically.io/img/oup.silverchair-cdn.com/oup/backfile/Content_public/Journal/smoa/12/3/10.1093_sexmed_qfae036/1/m_qfae036f1.jpeg?Expires=1724368167&Signature=JxBwsDbHLP3bm9LW~WWtvAdqBSfkk3hFqSfZW0Kld9mstKVf4B4Kqii9GlfpOV3IkDVvtcyif12nB1WlqgvQ4Cy9zsw6Y9uZbsATpdwfJZxvyYBos78nMth-JGj7TG~EeDk6SqASQbD5E14MfpeYiFt6Kh25EESO3d3~hb44UG8EDv~7WAMf71b~j66pt0RROKg1cKIiXgToyyDEtPDZdsK7-45Cx21qeZjn~E2M9nPCefj33WOuRaysNN-jsfJB3wMJVxdf4~O6OuY05tPtiamp7QdviVp79eSiYvTa5V53SmanK5o1WKNLCwc-A1YY2mOUxba4nUHIdOclNBcK2g__&Key-Pair-Id=APKAIE5G5CRDK6RD3PGA)
Figure 1. Global search interest by region for premature ejaculation from 2004 to 2023, based on Google Trends data (low search volume regions included).
Google Trends8 showed a decrease in searches regarding PE between 2004 and 2023 (Fig. 2).
![The global search interest over time for premature ejaculation from 2004 to 2023 using Google trends data.](https://cdn.statically.io/img/oup.silverchair-cdn.com/oup/backfile/Content_public/Journal/smoa/12/3/10.1093_sexmed_qfae036/1/m_qfae036f2.jpeg?Expires=1724368167&Signature=BeLxTesuR-in1aN6g4g08JNRqdTYGE0qbLO7kSMK5cIbqIoJWpQrYnLyN6q6rIH96n5UigRQ4iOtoBbtBmIazx7OivjAp7dcPeOAbcOaXA~MVd1EyjYsXJ8GYqtEPYI31SXse8phio7m-fcMpq5tbdnueH~65rxyBtpWRHDc0WGWe5Zkl9qxeWLa2YklftsdV3useqqnU9t2nn7gqrvgXpuf74cR8tKfX8cSf7dAiOxGfcxMhu1fsLBLKFQX1-nfMziTLc7J5i4ei5-J-QIU5MS2Oz-vpc1Q-NVEWIzJUbo~4nj2K~9GAq5AoCpyeyUUYFVvfkZh9FgN3APLxvJxDw__&Key-Pair-Id=APKAIE5G5CRDK6RD3PGA)
Figure 2. Global search interest over time for premature ejaculation from 2004 to 2023, based on Google Trends data.
The mean scores of the passages were EQIP 45.93 ± 4.34, FKGL 15.68 ± 1.67, FRES 15.8 ± 8.73, and DISCERN 38.1 ± 3.78. These results are presented in Table 2.
Table 2. Summary of statistical measures for EQIP, FRES, FKGL, and DISCERN and percentages of categories.
Parameter | Minimum | Maximum | Mean | SD |
---|---|---|---|---|
EQIP | 41.25 | 55.95 | 45.93 | 4.34 |
FKGL | 12.7 | 18.8 | 15.68 | 1.67 |
FRES | 3.3 | 37.4 | 15.8 | 8.73 |
DISCERN | 31 | 43.5 | 38.1 | 3.78 |

Category | Percentage | Number (out of 22) |
---|---|---|
Discharge or aftercare | 68.2% | 15 |
Condition or illness | 13.6% | 3 |
Miscellaneous | 18.2% | 4 |
Abbreviations: EQIP, Ensuring Quality Information for Patients; FKGL, Flesch–Kincaid Grade Level; FRES, Flesch Reading Ease Score.
The EQIP scores of the 3 categories were similar (P = .609). The FKGL scores of the discharge or aftercare group were significantly lower than those of the other groups (P = .006), whereas its FRES and DISCERN scores were significantly higher (P = .021 and P = .008, respectively; Table 3).
Table 3. Summary of statistical comparison of EQIP, FRES, FKGL, and DISCERN scores between the categories of EQIP.
Parameters, mean (SD) | Discharge or aftercare | Condition or illness | Miscellaneous | P value |
---|---|---|---|---|
EQIP | 46.2 (4.9) | 44.1 (3.2) | 46.5 (2.9) | .609 |
FKGL | 14.9 (1.2) | 17.9 (0.8) | 16.9 (1.4) | .006 |
FRES | 19.1 (8.2) | 8.2 (0.8) | 9.0 (6.6) | .021 |
DISCERN | 39.8 (2.9) | 32.8 (2.0) | 35.6 (2.6) | .008 |
Abbreviations: EQIP, Ensuring Quality Information for Patients; FKGL, Flesch–Kincaid Grade Level; FRES, Flesch Reading Ease Score.
Discussion
The integration of artificial intelligence (AI) into the healthcare sector seems to be unavoidable. It may be used in several healthcare contexts, such as clinical practice, research studies, patient education, peer reviews,17 and surgical technologies.18 ChatGPT has emerged as a potent tool for patients and clinicians alike, facilitating the acquisition and dissemination of information. As academic urologists, we need to explore the impact of developing AI technologies to optimize patient treatment and advance research efforts. With the increasing amount of medical information available on the Internet and social media, we need effective methods for assessing the quality and readability of this material so that we can guarantee that patients are getting accurate and suitable information.
Studies show that PE rates differ worldwide. The three countries with the highest search interest in PE (Ghana, Somalia, and Kenya) are located in Africa. There are few studies on the prevalence of PE in the African population, although PE was reported in 64.7% of 300 men studied in Ghana19 and in 37.1% of Somalian men in another study.20 Search trends currently show interest in PE almost all over the world. This shows that PE is a widespread problem and that people turn to the Internet to satisfy their curiosity about it. Around 30% of men between the ages of 18 and 59 years have difficulties with PE; however, feelings of shame and humiliation sometimes hinder them from openly addressing this delicate matter with their doctors.21 This situation leads men with PE to look to the Internet, which is widely accessible and increasingly used to address medical problems. Accordingly, most of the searches identified in this study concerned the treatment of PE. The decrease in searches over the years noted by Google Trends may also be due to the recent introduction of AI-based language models as an alternative to search engines.
To our knowledge, this is the first study to investigate the quality and comprehensibility of the replies generated by ChatGPT to the keywords most frequently searched about PE. The results indicate that the texts produced by ChatGPT are of highly questionable quality, are notably difficult to read and understand, and are of such linguistic complexity that they can be comprehended only by highly educated people.
There are obvious differences between the medical literature and ChatGPT texts. The outcomes of this study clearly indicate a need for improvement in the quality of texts produced by ChatGPT. For this enhancement to occur, many procedures might be used. Enabling ChatGPT to access medical literature and research might expand its knowledge base, enhancing its ability to produce credible and well-informed replies on health-related subjects. Furthermore, if specific factors that prioritize health information are integrated into the training of AI models like ChatGPT, their capacity to provide contextually suitable and medically precise replies may be significantly improved.
The clarity and understandability of health-related information have a vital effect on patients’ knowledge and their ability to make informed choices.22 The texts generated by ChatGPT regarding PE are difficult to read and understand, which implies that ChatGPT’s efficacy in communicating intricate medical information to users has constraints that may have an effect on the education of patients, the decision-making process, and the general understanding of information linked to ejaculation problems. Various strategies can be employed to improve the readability of ChatGPT-generated content. These include simplifying sentence structures, using clear and straightforward language, providing clear explanations, organizing content in a logical manner, incorporating visual elements, conducting user testing, collecting feedback, and utilizing readability metrics. To increase ChatGPT’s medical repository, it is necessary to rely on reputable scientific sources, establish a supervisory scientific board consisting of specialists, and engage expert teams to improve the clarity of medical terminology. Improving the accessibility and clarity of health information produced by ChatGPT has the capacity to empower patients, allow well-informed decision-making, and enhance overall patient outcomes, particularly in the setting of andrological diseases like PE.
The usual reading level for health-related material is grade 8 or below,23 but ChatGPT-generated material requires readers to have an educational level comparable to about 14-15 years of formal schooling. This finding highlights a discrepancy between the intricacy of the produced texts and the optimal degree of readability required for the efficient delivery of health information. For the purpose of enhancing readability, ChatGPT could adopt tactics such as offering specific instructions on composing health-related messages, augmenting training data with material that matches proper readability levels, and integrating methods for receiving feedback.
Although the use of online platforms like ChatGPT to acquire medical information is increasing, it cannot replace comprehensive medical examinations and explanations given by certified healthcare experts. Although Internet sites may provide helpful information, they cannot provide the tailored and thorough evaluation required for precise diagnosis and therapy. Moreover, the establishment of a patient–doctor connection is crucial to the provision of thorough and personalized treatment, a dimension that cannot be completely recreated via mere online contact, especially in regard to andrological diseases that are private and might be hard for the patients to discuss.24 The intricate characteristics of medical disorders highlight the need for a customized assessment, taking into account the patient’s medical background, symptoms, and other risk factors. This approach emphasizes the indispensable significance of an in-person medical evaluation in contrast to dependence on Internet-based sources of information. A patient-doctor connection is crucial to accounting for the medical backgrounds, symptoms, and possible comorbidities of each patient. Furthermore, the social backgrounds of patients and their families must be accounted for when medical guidance is offered. A detailed anamnesis and physical examination, laboratory examinations, and imaging methods are very important in the evaluation of andrological diseases, especially PE.
This study has several limitations. First, the study incorporated only the first 25 relevant terms, which may have restricted its scope unduly. Further research that uses larger sets of keywords might provide a more thorough and nuanced understanding of the subject matter. Furthermore, this research focused solely on English terms, whereas the regions that provided the most searches were in Africa, which may have affected the relevance of the results. Integrating searches performed in several languages would enhance the diversity and comprehensiveness of the search results. An increase in the competency of ChatGPT with regard to medical knowledge may be achieved by using reliable scientific sources, forming a supervisory scientific board, and involving expert teams to enhance the accessibility of medical terminology.
Conclusion
The findings of this research highlight substantial apprehensions about the caliber, readability, and understandability of texts produced using ChatGPT about PE. These texts demonstrate a degree of complexity that exceeds the recommended level of readability for successful health communication. While it is important to have easily understandable and correct health information for the well-being of patients so they can make informed decisions and follow treatment plans, it is critical to acknowledge that medical experts should be the ones to offer proper medical advice. Currently, ChatGPT use is incapable of substituting for a physician consultation for PE.
Author contributions
M.F.Ş.: Writing–Original Draft Preparation, Project Administration. A.K.: Conceptualization, Investigation, Writing–Review & Editing. E.C.T.: Project Administration. R.Ö.: Conceptualization, Investigation. Ç.D.: Writing–Original Draft Preparation, Writing–Review & Editing. M.A.: Writing–Review & Editing, Funding Acquisition. C.Y.: Conceptualization, Supervision, Writing–Review & Editing, Funding Acquisition.
Funding
None.
Conflicts of interest
None of the authors received any type of financial support that could be considered a potential conflict of interest regarding the manuscript or its submission.
Scientific responsibility statement
The authors declare that they are responsible for the article’s scientific content, including study design, data collection, analysis and interpretation, writing, some of the main line, or all of the preparation and scientific review of the contents and approval of the final version of the article.
Animal and human rights statement
All procedures performed in this study were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. No animal or human studies were carried out by the authors for this article.
Ethical statement
No ethical approval was needed because this was not a human study and only online information was used.
References
Ghanem D, Covarrubias O, Harris A, et al.
Golan R, Reddy R, Deebel NA.
Golan R, Reddy R, Muthigi A.
Amidu N, Owiredu WK, Woode E, et al.
Mohamed AH, Mohamud HA, Yasar A. The prevalence of premature ejaculation and its relationship with polygamous men: a cross-sectional observational study at a tertiary hospital in Somalia.
Crowdis M, Leslie SW, Nazir S.
Khan B, Fatima H, Qureshi A, et al.
Michael O, Emma T, Jenny J, et al.