JDSE

The Journal of Dental Sciences and Education covers General Dentistry, Pediatric Dentistry, Restorative Dentistry, Orthodontics, Oral Diagnosis and Dentomaxillofacial Radiology, Endodontics, Prosthetic Dentistry, Periodontology, Oral and Maxillofacial Surgery, Oral Implantology, Dental Education, and other fields of dentistry, and accepts articles on these topics. The journal publishes original research articles, review articles, case reports, editorial commentaries, letters to the editor, educational articles, and conference/meeting announcements.

Original Article
Comparative evaluation of the performance of AI-powered chatbots in answering basic sciences questions of the dentistry specialization examination
Aims: This study aimed to evaluate the rate and probability of correct answers provided by the AI-powered chatbots Gemini Advanced 2.5 Pro, ChatGPT-4 omni, ChatGPT-5, and DeepSeek v3 to single-answer, multiple-choice basic sciences questions from the Dentistry Specialization Examination (DUS) administered between 2012 and 2025.
Methods: A total of 539 multiple-choice questions from the basic sciences section of the DUS administered between 2012 and 2025 were used. Each question was presented to each chatbot in a new session. The rates of correct and incorrect answers were calculated by question subject, examination year, and chatbot model. The rate and probability of incorrect answers were evaluated using chi-square tests and binary logistic regression analysis.
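For readers who wish to reproduce this kind of analysis, a minimal Python sketch of the chi-square and logistic regression steps is given below. The file name dus_answers.csv, the column names, and the reference categories are illustrative assumptions, not the study's actual materials.

# Hypothetical sketch of the statistical analysis described in Methods.
# The input file, column names, and reference categories are assumptions.
import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import chi2_contingency

# One row per (question, chatbot) attempt: 539 questions x 4 models,
# with columns: year (int), subject (str), model (str), correct (0/1).
df = pd.read_csv("dus_answers.csv")
df["incorrect"] = 1 - df["correct"]

# Chi-square test: does the rate of incorrect answers differ across models?
table = pd.crosstab(df["model"], df["incorrect"])
chi2, p, dof, _ = chi2_contingency(table)
print(f"chi2={chi2:.2f}, dof={dof}, p={p:.4f}")

# Binary logistic regression: odds of an incorrect answer by year, subject,
# and model, each compared against an (assumed) reference category.
res = smf.logit(
    "incorrect ~ C(year, Treatment(reference=2025))"
    " + C(subject, Treatment(reference='Pathology'))"
    " + C(model, Treatment(reference='Gemini Advanced 2.5 Pro'))",
    data=df,
).fit()
print(res.summary())

In this setup, a positive coefficient (odds ratio above 1) for a year, subject, or model term corresponds to a higher risk of an incorrect answer relative to the reference group.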
Results: The rate of incorrect answers was highest in 2012, with a significant decrease observed in subsequent years (p<0.05). Among the subjects, Anatomy had the highest rate of incorrect answers, while Pathology had the lowest (p<0.05). When comparing chatbot models, Gemini Advanced 2.5 Pro was found to have a statistically significantly lower error rate than ChatGPT-4 omni and DeepSeek v3 (p<0.05). In the regression analysis, the risk of providing an incorrect answer was statistically significantly higher for the years 2012 and 2018; for the fields of Anatomy, Physiology, and Microbiology; and for the ChatGPT-4 omni and DeepSeek v3 models, compared to their respective reference groups (p<0.05).
Conclusion: AI-powered chatbots answered more recent questions more accurately. In the subject-based performance analysis, the rate of incorrect answers was highest for Anatomy questions. In the chatbot comparison, Gemini Advanced 2.5 Pro produced more accurate answers than DeepSeek v3. While AI-powered chatbots can serve as a supplementary tool when preparing for the DUS, their accuracy and competence across subjects remain limited.


Volume 4, Issue 1, 2026
Pages: 17-22