Patients in the United States speak many different languages. English is the dominant language in most healthcare settings, but many patients prefer or need care in languages such as Spanish, Chinese, Tagalog, and Vietnamese. This linguistic diversity can create challenges for healthcare workers, especially when they need to take accurate medical histories, understand symptoms, or make diagnoses.
When language barriers cause miscommunication, the result can be misdiagnosis, delayed treatment, and poor patient satisfaction. Doctors and nurses rely on clear communication to gather clinical information, order the right tests, and interpret results. Busy hospitals and clinics also impose time constraints, which add pressure and increase the risk of error, especially for less experienced staff. AI can help by supporting clearer communication and aiding decision-making across languages.
The PERFORM Study compared AI large language models (LLMs) with human physicians on diagnostic tasks in obstetrics and gynecology (OB-GYN). The study tested eight AI models and 24 OB-GYN residents across all five years of training, using 60 clinical cases presented in English and Italian; some cases were timed and others were not.
The study found that the AI models achieved a higher overall diagnostic accuracy of 73.75%, versus 65.35% for the residents. This difference was statistically significant (P<.001), showing that AI can match or exceed human performance on these tasks.
AI maintained its accuracy even when cases were presented in different languages. The top systems, ChatGPT o1-preview, GPT-4o, and Claude 3.5 Sonnet, reached about 88.33% accuracy, with a language-related drop of only 6.67%. This suggests AI tools can help clinicians diagnose patients who speak different languages, which matters in the U.S., where many patients do not speak English as their first language.
Human residents were more affected by external factors, especially time pressure: their accuracy dropped from 73.2% to 56.5% when cases were timed, showing how pressure can hurt performance. AI did not decline as sharply, remaining more reliable even when decisions had to be made quickly, as during busy shifts.
The PERFORM Study also showed how AI can support newer doctors. First-year residents scored only about 44.7% accuracy, far below the 87.1% of fifth-year residents. With AI assistance, early-career residents' accuracy improved by nearly 30% (+29.7%), a substantial gain.
This means AI can guide less experienced clinicians and help them verify their work. Hospital administrators can use it to make frontline diagnoses more reliable: junior doctors get support to confirm their reasoning or consider alternative diagnoses, which reduces errors and improves patient safety, especially during busy periods.
Because many U.S. hospitals are short-staffed, they often rely on newer clinicians. AI supports these clinicians without requiring constant supervision from senior doctors, letting hospitals deploy staff more effectively while keeping diagnostic quality high.
For healthcare managers and practice owners, the data suggest that AI can streamline clinic workflows and address language challenges. Patients with limited English proficiency often struggle to get care that fits their language needs; AI systems that work across many languages can help during initial patient visits, triage, and recordkeeping.
IT managers play a key role in deploying AI safely. They must ensure AI integrates with existing electronic health record (EHR) systems and office software such as scheduling and communication tools. IT staff must also protect patient data, comply with regulations like HIPAA, and keep systems running reliably to avoid interruptions in care.
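One concrete safeguard IT teams often apply is scrubbing obvious patient identifiers from free text before it leaves the organization, for example before a note is sent to an external AI service. The sketch below is a minimal, hypothetical illustration of that idea using simple regular expressions; real HIPAA compliance requires far more (business associate agreements, access controls, audit logging, and more robust de-identification than pattern matching).

```python
import re

# Hypothetical PHI-scrubbing step applied before text is sent to an external
# AI service. Pattern matching alone is NOT sufficient for real de-identification;
# this only illustrates the shape of such a safeguard.
PHI_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "DOB": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
}

def redact_phi(text: str) -> str:
    """Replace matched identifiers with labeled placeholders."""
    for label, pattern in PHI_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

note = "Patient DOB 03/14/1980, phone 555-123-4567, reports pelvic pain."
print(redact_phi(note))
# → Patient DOB [DOB], phone [PHONE], reports pelvic pain.
```

In practice this kind of filter sits alongside, not in place of, contractual and access-control protections.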
Hospitals and clinics should view AI not just as a diagnostic tool but as part of a broader effort to automate routine tasks and serve patients across many languages.
Automation of Front Desk and Patient Communication
AI can automate much front-office work. AI voice systems and chatbots can handle appointment booking, patient check-in, reminders, and common questions in multiple languages without human involvement. This reduces staff workload, cuts wait times, and lowers the chance of language-related misunderstandings.
For example, Simbo AI offers AI-powered front-office phone systems. An AI phone service that understands many languages can give patients clear information in whatever language they use, and it frees medical staff to focus on patient care instead of phone calls.
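The core pattern behind multilingual front-office automation is simple: pick a response template in the patient's language and fill in the details. The sketch below is an illustrative stand-in, not Simbo AI's actual implementation; a production system would use real language identification, speech interfaces, and scheduling-system integration.

```python
# Minimal sketch of a multilingual appointment confirmation. The language
# codes and templates are illustrative assumptions, not a real product API.
CONFIRMATION_TEMPLATES = {
    "en": "Your appointment is confirmed for {time}.",
    "es": "Su cita está confirmada para {time}.",
    "vi": "Cuộc hẹn của bạn đã được xác nhận vào {time}.",
}

def confirm_appointment(lang: str, time: str) -> str:
    """Return a confirmation message in the requested language, falling back to English."""
    template = CONFIRMATION_TEMPLATES.get(lang, CONFIRMATION_TEMPLATES["en"])
    return template.format(time=time)

print(confirm_appointment("es", "10:30 AM"))
# → Su cita está confirmada para 10:30 AM.
```

The fallback to English is a deliberate design choice: an understandable message in the wrong language is better than no message at all.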
Clinical Decision Support Systems
AI models with strong diagnostic performance can be integrated into clinical decision support systems (CDSS). These systems assist clinicians by reviewing clinical information and suggesting possible diagnoses, treatments, or next steps. A multilingual AI-powered CDSS can improve care quality by catching errors that stem from cross-language miscommunication.
Reducing Workload and Time Pressure
The PERFORM Study showed that human performance declines under time pressure, which is common in busy hospitals, while AI handles it better. AI can act as a second opinion and help clinicians make fast, accurate decisions. Automated support also reduces cognitive load and helps prevent burnout from rapid, complex decision-making.
Supporting Training and Continuous Learning
AI can serve as a training tool for hospitals with residency programs or continuing education. It can simulate realistic diagnostic cases in multiple languages and give immediate feedback, helping clinicians at different skill levels learn. This is especially useful in clinics serving patients who speak many languages.
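The feedback loop such a training tool runs is straightforward: present simulated cases, collect the trainee's answers, and score them against reference diagnoses. The sketch below shows that loop with invented case data; a real system would draw on validated case banks and accept free-text answers in multiple languages.

```python
# Illustrative sketch of an AI-assisted training drill: compare a resident's
# answers on simulated cases against reference diagnoses. Case data is
# invented for the example.
CASES = [
    {"prompt": "Severe RLQ pain, positive hCG", "answer": "ectopic pregnancy"},
    {"prompt": "Fever, discharge, cervical motion tenderness", "answer": "PID"},
]

def score_responses(responses: list[str]) -> float:
    """Return the fraction of responses matching the reference answers."""
    correct = sum(
        r.strip().lower() == case["answer"].lower()
        for r, case in zip(responses, CASES)
    )
    return correct / len(CASES)

print(score_responses(["Ectopic pregnancy", "appendicitis"]))  # → 0.5
```

Per-case feedback (which answer was wrong and why) matters more for learning than the aggregate score, but the aggregate is what lets a program track progress across training years.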
The PERFORM Study shows that AI is becoming able to work well in different languages while keeping strong diagnostic skills. This is important in the U.S., where many patients speak languages other than English.
By applying AI to front-office tasks and clinical decisions, healthcare providers can reduce errors caused by language barriers and misdiagnosis, and newer doctors gain support that builds their skills and confidence.
Hospitals and clinics, especially in cities where many languages are spoken, can adopt AI tools such as Simbo AI's phone automation and advanced diagnostic systems. These tools smooth workflows and support more equitable care for all patients, regardless of language.
This discussion shows that AI is a tool to assist doctors, not replace them. AI’s ability to work across languages and stay steady under time pressure may help improve healthcare in diverse U.S. communities. Healthcare managers, practice owners, and IT leaders should understand and invest in AI to improve efficiency and patient care in the changing healthcare environment.
The primary objective of the PERFORM Study was to systematically evaluate the performance of artificial intelligence (AI) large language models (LLMs) compared to obstetrics-gynecology residents in clinical decision-making, focusing on diagnostic accuracy and error patterns.
The study evaluated 8 AI LLMs and 24 obstetrics-gynecology residents across their first five years of training.
The primary outcome measure was diagnostic accuracy, while secondary endpoints included performance under time constraints and language impact.
AI LLMs achieved an overall diagnostic accuracy of 73.75%, while residents achieved 65.35%, a significant difference (P<.001).
Residents exhibited a marked decline in diagnostic accuracy under time pressure, dropping from 73.2% to 56.5% adjusted accuracy.
Error pattern analysis indicated a moderate correlation between AI and human reasoning with a coefficient of r=0.666, suggesting similarities in decision-making processes.
AI LLMs provided the most significant enhancement in diagnostic accuracy for early-career residents, with an improvement of +29.7% (P<.001).
The AI systems demonstrated high cross-linguistic accuracy (88.33%) with minimal language impact, indicating robustness across different languages.
The residents assessed in the study were from all five years of training in obstetrics and gynecology.
The findings suggest that AI-enhanced decision-making may improve diagnostic consistency and reduce cognitive load, particularly benefitting junior residents in time-sensitive clinical settings.