(Hallucinations)
24,419 results
  • Evaluation of AI Citation Accuracy in Anterior Segment Research. [Journal Article]
    Cesk Slov Oftalmol. 2026; 82(Ahead of Print):1-5. Civelekler M, Çıtırık M
    CONCLUSIONS: This pilot study indicates that contemporary AI models, particularly those like DeepSeek, show potential in assisting with citation generation. However, the observed error rates, including instances of hallucination, remain substantial. These findings underscore that rigorous human verification is indispensable when using AI for academic referencing in specialized medical fields, and highlight the need for continuous, version-specific benchmarking as these tools evolve.
  • Comparative evaluation of seven large language models in providing home phototherapy care guidance for neonatal hyperbilirubinemia. [Journal Article]
    Transl Pediatr. 2026 Apr 30; 15(4):121. Huang S, Huang Q, … Wei H
    CONCLUSIONS: DeepSeek-R1 and ChatGPT-4o achieved near-expert accuracy, potentially reliable for basic queries under oversight. Claude 3.5 Sonnet and GLM-4 showed moderate performance with notable gaps. Copilot, Gemini, and ERNIE demonstrated critical errors. However, all models exhibited hallucinated references and a lack of patient-specific reasoning. LLMs should not replace professional medical judgment in treatment decisions. Safe implementation requires restricting high-performing LLMs to narrowly defined procedural questions only, with mandatory physician verification.