AI is being used in ophthalmology to help doctors make better diagnoses, plan treatments, and manage patients more easily. Some AI programs, like large language models such as GPT-4, can handle complicated medical tasks involving eye diseases like glaucoma and retinal problems. A recent study from the New York Eye and Ear Infirmary of Mount Sinai, published in JAMA Ophthalmology, found that GPT-4 outperformed glaucoma specialists at diagnosing cases and matched retina specialists in managing them.
In the study, 20 cases of glaucoma and retinal disease were given to both the AI and human specialists. The AI scored higher in diagnostic accuracy than the glaucoma and retina specialists, and its answers were more complete, offering full medical reasoning rather than a bare analysis. This suggests AI could help doctors make decisions faster, especially in places that don't have many eye care experts.
Dr. Louis R. Pasquale, who worked on the study, compared AI’s effect on doctors to how Grammarly helps writers improve their work. Dr. Andy Huang, the lead author, said the AI’s performance was surprising and shows it could help patients get expert advice faster.
Even though these results are encouraging, the researchers say more testing is needed before AI is used widely in eye clinics. They worry about how safe, reliable, and flexible AI tools are in many different medical settings.
AI does well in research, but there is still a big gap between building AI systems and using them in real eye care clinics in the U.S. Research published by Elsevier points out that before AI is used in clinics, it needs strong systems to make sure it works well, is safe, and can be monitored closely over time.
Important qualities for trustworthy AI include reliable performance, safety, and the ability to be monitored continuously over time.
Making AI systems that meet these needs is not just the job of AI developers. It requires teamwork among eye doctors, hospitals, government agencies, health insurers, and patients. This teamwork should start when AI is being made and continue after it is in use, with regular checking of how well it works.
Hospital leaders and clinic managers have to look not only at the AI’s technical skills but also how the AI will be managed and supported in everyday work.
A group of eye doctors and AI specialists created a guide published in Ophthalmology Science (2025) to help doctors and clinic managers understand how to check if AI systems are ready to use. This guide gives a clear way to review AI tools.
The guide says reviewers should carefully weigh factors such as a tool's safety, its expected clinical benefit, and how it will fit into existing clinic systems.
Clinic owners and IT managers can use these ideas to set safety and benefits rules when choosing AI tools.
Studies show that while AI systems like GPT-4 can do well at diagnosing eye problems like glaucoma and retina diseases, they do not replace the judgment of doctors. They work as helpers to doctors, making notes easier, standardizing records, offering advice, and making work flow better.
Because patient cases are very different and clinics can be busy and complex, AI must handle unusual cases and avoid errors. That is why U.S. hospitals and clinics require strong testing before using AI tools, especially in eye care where mistakes can cause vision loss.
More research is also needed to check how AI affects long-term patient safety, legal questions, and outcomes before AI tools become regular parts of care.
AI use in eye care goes beyond diagnosis. It also helps automate front-office and administrative work. For example, companies like Simbo AI offer AI phone systems that help patients reach clinics and manage appointments. This can lower the workload for office staff, reduce wait times on calls, and improve how patients experience the clinic.
By using AI to handle tasks like scheduling, reminders, and questions, staff can focus more on clinical work while the AI handles routine office work.
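As a rough illustration of the kind of routine call handling described above, here is a minimal keyword-based intent router. This is a hypothetical sketch, not Simbo AI's actual system; the intent names and keywords are invented for the example, and real products use speech recognition and far richer language understanding.

```python
# Minimal keyword-based intent router for front-office phone requests.
# Hypothetical sketch only; a production system would use speech
# recognition and a trained language-understanding model.
INTENTS = {
    "schedule": ["appointment", "book", "schedule", "reschedule"],
    "reminder": ["remind", "confirm", "confirmation"],
    "question": ["hours", "insurance", "location", "directions"],
}

def route_call(transcript: str) -> str:
    """Return the intent name for a caller's request, or 'staff' to escalate."""
    text = transcript.lower()
    for intent, keywords in INTENTS.items():
        if any(word in text for word in keywords):
            return intent
    return "staff"  # anything unrecognized goes to a human

print(route_call("I need to book an appointment for my glaucoma check"))
# "schedule"
```

The escalation fallback mirrors the point in the text: the AI absorbs routine scheduling and questions, while anything unusual still reaches office staff.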
These systems also help clinics handle common problems such as high call volumes, long hold times, and heavy front-office workloads.
Putting together AI that helps with both patient care and office work creates a smoother system. This helps clinics use resources better and gives patients steady care, especially those with complex eye problems.
However, just like AI for diagnosis, workflow tools need careful testing for safety, data protection, and easy fitting into clinic systems. Medical practice leaders in the U.S. must review these tools well before fully using them.
For clinic administrators and IT managers in the U.S., where health care operates under strict regulations and serves diverse populations, careful evaluation of AI tools is especially important.
Studies and expert opinions agree that clinics should be careful when adopting AI. Research from New York Eye and Ear Infirmary of Mount Sinai shows that AI like GPT-4 works well in tests but needs more diverse and larger evaluations before using it daily in clinics.
Research by Cristina González-Gonzalo and others points out that trust in AI comes from thoughtful design and keeping many people involved over time. Angela McCarthy and Ives Valenzuela, who work with eye doctors and AI experts, say that a clear and science-based review method is important before using AI tools for patient care decisions.
AI is playing a growing role in eye care, along with automation that changes how clinics work. This offers useful options for U.S. health providers. Still, clinic leaders and IT managers must choose AI systems that have been tested carefully, are safe, and fit well into their operations. Only by following this careful method can eye clinics use AI technology while keeping good care for patients.
The study found that AI, specifically the GPT-4 chatbot, was able to match or outperform human specialists in the management of glaucoma and retinal disease based on diagnostic accuracy and comprehensiveness.
Researchers presented ophthalmological case management questions to the GPT-4 chatbot and compared its responses with those of fellowship-trained glaucoma and retina specialists, scoring them on a Likert scale for accuracy and completeness.
The mean rank for accuracy was 506.2 for the LLM chatbot vs. 403.4 for glaucoma specialists, and for completeness it was 528.3 vs. 398.7, indicating significantly higher ratings for the chatbot.
The Dunn post-hoc test showed significant differences between the AI's ratings and the specialists' ratings for both diagnostic accuracy and completeness, except for the specialist vs. trainee comparison.
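The rank-based comparison described above can be sketched in a few lines. The ratings below are invented for illustration and are not the study's data; the snippet computes per-group mean ranks over the pooled scores (the statistic reported as 506.2 vs. 403.4) and runs SciPy's omnibus Kruskal-Wallis test. Dunn's pairwise post-hoc test itself is typically run with a separate package such as scikit-posthocs.

```python
import numpy as np
from scipy.stats import kruskal, rankdata

# Hypothetical Likert-style accuracy ratings (1-10), for illustration only.
llm = np.array([9, 8, 9, 10, 8, 9, 7, 9])
glaucoma = np.array([7, 6, 8, 7, 6, 7, 8, 6])

# Mean rank per group, computed over the pooled ratings -- the kind of
# statistic the study reports (e.g. 506.2 vs. 403.4 for accuracy).
pooled = np.concatenate([llm, glaucoma])
ranks = rankdata(pooled)  # average ranks for ties
mean_rank_llm = ranks[: len(llm)].mean()
mean_rank_glaucoma = ranks[len(llm):].mean()

# Omnibus Kruskal-Wallis test; a Dunn post-hoc test (as in the study)
# would then compare group pairs with a multiple-comparison correction.
stat, p = kruskal(llm, glaucoma)
print(mean_rank_llm, mean_rank_glaucoma, p)
```

A higher mean rank for one group means its ratings sit higher in the pooled ordering; the Kruskal-Wallis p-value says whether that separation is larger than chance would explain.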
The study suggests that AI could play a significant role in diagnosing and managing glaucoma and retinal diseases, potentially serving as a supportive tool for eyecare providers.
AI tools like GPT-4 can provide guidance on documentation and clinical decision-making, helping ophthalmologists improve their clinical practices in patient management.
Dr. Pasquale highlighted that AI’s proficiency in handling patient cases was surprising and that it could enhance clinician skills, similar to how Grammarly aids writers.
Dr. Huang noted that the performance of GPT-4 was eye-opening and indicated the massive potential for AI systems in enhancing clinical practices for seasoned specialists.
The lead author acknowledged that while the findings are promising, additional testing is needed to validate the AI’s performance before it can be fully integrated into clinical settings.
Integration of AI could lead to faster access to expert advice for patients, resulting in more informed decision-making and potentially improved treatment outcomes.