Healthcare has changed significantly in recent years, particularly with the integration of artificial intelligence (AI) into everyday clinical work. Among AI's many applications, medical imaging stands out as an area where it could help physicians diagnose more accurately and work more efficiently. For medical practice managers, owners, and IT staff in the United States, understanding how multimodal AI models work in imaging is important: it informs decisions about technology purchases, tool adoption, and workflow changes.
This article reviews the current role of AI in diagnostic imaging, focusing on multimodal AI models, which can analyze both images and text. It also discusses the challenges and opportunities of deploying this technology in U.S. healthcare, including how AI might streamline front-office processes and administrative tasks through automation.
Multimodal AI refers to artificial intelligence that can process different types of information at the same time. In medical imaging, this usually means combining images such as X-rays, MRIs, or CT scans with text such as patient histories, lab results, and physicians' notes. This distinguishes multimodal models from earlier AI systems that analyzed images alone.
A recent study by the National Institutes of Health (NIH) tested GPT-4V, a multimodal AI that can interpret both images and text. The evaluation used medical quizzes based on clinical images, including 207 difficult questions from the New England Journal of Medicine Image Challenge.
The results showed that GPT-4V frequently selected the correct diagnosis and outperformed physicians working without any reference aids. However, the model struggled to explain why it chose those answers, and sometimes described images inaccurately even when its final diagnosis was correct. Physicians with access to reference materials outperformed the AI, especially on difficult questions, showing that human expertise combined with information resources still matters. Stephen Sherry, Ph.D., Acting Director of the National Library of Medicine, said AI is useful as a tool but not ready to replace expert physicians.
These findings indicate that AI models can work quickly and recognize patterns well, but they still struggle to interpret complex medical cases and to explain their answers clearly.
In medical imaging, convolutional neural networks (CNNs) have long been the standard AI tools, performing well at tasks such as image classification. A study comparing CNNs with large language models (LLMs) such as GPT-4o and Llama3.2-vision found that CNNs were more accurate across many image types, including chest X-rays, brain MRIs, and CT scans.
CNNs achieved 83% accuracy on chest X-rays and a near-perfect 98% on brain MRIs, while GPT-4o and Llama3.2-vision scored much lower, dropping as low as 22% on CT scan classification. CNNs also consumed less power and ran faster than LLMs, both of which matter for hospitals that need reliable, efficient systems.
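To make the CNN comparison above concrete, the core operation behind a CNN is a learned 2-D convolution that scans an image for local patterns such as edges. The sketch below is purely illustrative (a toy 4x4 "image" and a hand-written vertical-edge kernel, not a real trained network or medical data):

```python
# Illustrative sketch of the 2-D convolution at the heart of a CNN.
# Deep-learning libraries actually compute cross-correlation, as here.
def conv2d(image, kernel):
    """Valid-mode 2-D cross-correlation over a small grayscale image."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(ih - kh + 1):          # slide the kernel over every
        row = []                          # position where it fully fits
        for j in range(iw - kw + 1):
            s = sum(image[i + di][j + dj] * kernel[di][dj]
                    for di in range(kh) for dj in range(kw))
            row.append(s)
        out.append(row)
    return out

# Toy 4x4 "scan" with a sharp vertical boundary between columns 1 and 2,
# and a hand-crafted vertical-edge detector kernel (a CNN learns these).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
kernel = [[1, -1],
          [1, -1]]
print(conv2d(image, kernel))  # → [[0, -2, 0], [0, -2, 0], [0, -2, 0]]
```

The non-zero middle column in the output marks exactly where the intensity edge sits; a real CNN stacks many such learned filters with nonlinearities and pooling to classify whole images.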
Despite their lower accuracy, the LLMs often expressed high confidence in their answers. This "overconfidence" means the model appeared certain even when it was wrong, a gap that must be closed through better calibration before LLMs can be trusted in medical use.
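The overconfidence described above can be measured directly by comparing a model's stated confidence with its actual accuracy. The sketch below uses entirely made-up example predictions (not data from any study) to show the idea; a positive gap means the model sounds surer than it is:

```python
# Hypothetical sketch: measuring "overconfidence" as the gap between a
# model's average stated confidence and its measured accuracy.
# The prediction records below are fabricated for illustration only.
def confidence_gap(predictions):
    """Average confidence minus accuracy; positive means overconfident."""
    avg_conf = sum(p["confidence"] for p in predictions) / len(predictions)
    accuracy = sum(1 for p in predictions if p["correct"]) / len(predictions)
    return avg_conf - accuracy

preds = [
    {"confidence": 0.95, "correct": True},
    {"confidence": 0.92, "correct": False},
    {"confidence": 0.90, "correct": False},
    {"confidence": 0.97, "correct": True},
]
gap = confidence_gap(preds)
print(round(gap, 3))  # large positive gap: stated certainty outruns accuracy
```

Well-calibrated models have a gap near zero; in practice this check is done per confidence bucket (expected calibration error), but the single-number version above captures the intuition.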
Researchers are also improving LLM performance through better data filtering. One method raised GPT-4o's accuracy on chest X-rays from 62% to 82% while cutting both time and energy use. This suggests that future systems combining CNNs and LLMs could offer both accuracy and reasoning.
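One plausible way to combine the two model families mentioned above is a confidence-based router: use the fast, accurate CNN by default and escalate only uncertain cases to the slower, reasoning-capable LLM. The sketch below is a hypothetical design, not any study's actual pipeline; `classify_cnn` and `classify_llm` are stand-in stubs rather than real model calls:

```python
# Hypothetical hybrid pipeline: route each image to a CNN first, and
# defer to an LLM only when the CNN's confidence falls below a threshold.
# Both classifiers are stubs that read pre-baked fields for illustration.
def classify_cnn(case):
    # Stub: a real CNN head would return (label, softmax confidence).
    return case["cnn_label"], case["cnn_confidence"]

def classify_llm(case):
    # Stub: a real LLM call could also return a textual rationale.
    return case["llm_label"], "rationale: ..."

def hybrid_classify(case, threshold=0.85):
    label, conf = classify_cnn(case)
    if conf >= threshold:                       # confident: trust the CNN
        return {"label": label, "source": "cnn"}
    label, rationale = classify_llm(case)       # unsure: escalate to LLM
    return {"label": label, "source": "llm", "rationale": rationale}

# A confident CNN prediction is used directly; an uncertain one escalates.
easy = {"cnn_label": "normal", "cnn_confidence": 0.97, "llm_label": "normal"}
hard = {"cnn_label": "normal", "cnn_confidence": 0.52, "llm_label": "pneumonia"}
print(hybrid_classify(easy)["source"])  # → cnn
print(hybrid_classify(hard)["source"])  # → llm
```

The threshold trades cost for coverage: raising it sends more cases to the LLM. In a clinical setting, the escalation path would end at a human reader, not a second model.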
A review of 30 studies published since 2019 highlights four main areas in which AI affects healthcare.
Together, these areas show that AI supports not only diagnosis but also healthcare resource management and patient care.
Although AI shows promise, several challenges must be addressed before it can be used widely in U.S. medical imaging departments.
AI is also being used to automate front-office and administrative work in medical practices. For example, companies such as Simbo AI use AI to run phone systems and answer calls for healthcare organizations, reducing staff workload, speeding up responses to patients, and lowering administrative errors.
Pairing AI for front-office tasks with diagnostic AI can improve the entire workflow.
For U.S. healthcare managers, investing in AI for both front-office automation and diagnostic tasks can improve the patient experience, speed up operations, and control costs.
Emerging Generalist Medical AI (GMAI) models are becoming capable of performing many medical tasks at once, handling images, lab results, genetic information, and clinical text together. Researchers from institutions including Stanford, Harvard, and Yale found that GMAI models can explain their reasoning in written or spoken form.
This broader capability could combine imaging AI with other diagnostic and management tasks, making GMAI a useful tool for healthcare workers. These models also raise questions about whether current validation and regulation can keep pace with this new kind of AI in the U.S.
Healthcare leaders in the United States should weigh several considerations when deciding how to adopt multimodal AI in diagnostic imaging.
In short, multimodal AI models could substantially change diagnostic imaging in U.S. healthcare. AI's strength at analyzing images alongside clinical data can make diagnoses more accurate, reduce errors, and speed up decisions. These benefits, however, depend on careful deployment, thorough staff training, and ongoing evaluation of AI alongside human expertise. By managing these factors well, practice managers, owners, and IT staff can use AI to improve patient care and daily operations.
The NIH study found that the AI model GPT-4V performed well in diagnosing medical images but struggled with explaining its reasoning, highlighting both its potential and limitations in clinical settings.
The AI selected correct diagnoses more frequently than physicians in closed-book settings, while physicians using open-book resources performed better, particularly on difficult questions.
The AI often misinterpreted medical images and failed to correlate conditions despite accurate diagnoses, demonstrating gaps in its interpretative capabilities.
It’s crucial to assess AI’s strengths and weaknesses to understand its role in improving clinical decision-making and ensure effective integration into healthcare.
The study was led by researchers from NIH’s National Library of Medicine (NLM) in collaboration with several prestigious medical institutions including Weill Cornell Medicine.
The tested model was GPT-4V, a multimodal AI capable of processing both text and image data, relevant to diagnosing medical conditions.
NLM supports biomedical informatics and data science research, aiming to improve the processing, storage, and communication of health information.
Despite AI’s capabilities, human experience is essential for accurately diagnosing patients, as AI may lack contextual understanding necessary for correct interpretations.
Further research is required to compare AI capabilities with those of human physicians to fully understand its potential in clinical settings.
The findings suggest that while AI can enhance diagnosis speed, its current limitations necessitate careful evaluation before widespread implementation in healthcare.