IBM Watson for Oncology was once seen as a major AI project to help doctors choose cancer treatments. But it did not provide safe or useful advice. It was trained largely on made-up (hypothetical) cases and did not have enough varied real patient records. Because it was not trained on real cases, it gave wrong advice that could have hurt patients. After a lot of money was spent, the project was stopped.
This failure shows why it is important to use good and complete data to teach AI in healthcare. Using only limited or fake data can cause wrong results. This may make doctors and patients lose trust.
The Epic Sepsis Model is a tool used in some U.S. hospitals to find patients who might have sepsis, a life-threatening reaction to infection. But in a study across 15 hospitals, the model missed 67% of patients with sepsis. It did not improve treatment or patient outcomes compared to hospitals that did not use it. This means the AI was not reliable enough to guide doctors' decisions.
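To make the 67% figure concrete, here is a minimal sketch of how a miss rate relates to sensitivity, the share of truly septic patients a model flags. The patient counts are hypothetical and chosen only to reproduce a 67% miss rate; they are not the study's data.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Share of truly septic patients that the model flagged."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no positive cases to evaluate")
    return true_positives / total


# Hypothetical counts, chosen only to reproduce a 67% miss rate.
flagged_septic = 33  # septic patients the model alerted on
missed_septic = 67   # septic patients the model never alerted on

sens = sensitivity(flagged_septic, missed_septic)
miss_rate = 1 - sens  # fraction of septic patients the model missed

print(f"sensitivity={sens:.2f}, miss rate={miss_rate:.2f}")
```

A model that misses 67% of cases has a sensitivity of only 33%, which means about two out of three septic patients get no alert.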
At the start of the COVID-19 pandemic, many AI models were created to help detect infections from scan images. But many of these were trained with bad data. For example, one AI learned to spot how the patient was positioned, not how bad the infection was. This caused wrong predictions. This happened because the data was not correct or was labeled wrong. Such mistakes can delay correct diagnosis, lead to wrong treatments, and cause extra worry for patients.
Apart from medical decisions, AI chatbots and virtual helpers have also had problems. Air Canada’s AI chatbot gave a customer wrong details about bereavement fares. The case went to a tribunal, which ordered the airline to compensate the customer. McDonald’s stopped its AI voice ordering pilot in 2024 after many order mistakes upset customers.
Even though these examples are not in healthcare, they show how AI mistakes in talking to customers can break trust and cause money loss. Healthcare groups using AI to talk with patients must make sure it is correct and trustworthy to avoid the same problems.
Amazon’s AI hiring tool, while not healthcare-related, showed a serious problem that health AI users should note. The tool learned to be unfair to women because it was trained on resumes from the past decade, most of which came from men. It was stopped in 2018 after showing bias issues.
For health providers, this warning shows the risk of using AI trained on incomplete or biased data. AI bias can cause unfair care or treatment of workers and patients. Ethical rules and fairness must come first when building and using AI.
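One simple way to look for this kind of bias is to compare how often a model catches true cases in each patient group. The sketch below is only an illustration under stated assumptions: the group names, the records, and the `recall_by_group` helper are all hypothetical, and a real fairness review would look at more than one measure.

```python
from collections import defaultdict


def recall_by_group(records):
    """Return {group: recall} from (group, actual, predicted) records."""
    hits = defaultdict(int)
    positives = defaultdict(int)
    for group, actual, predicted in records:
        if actual:
            positives[group] += 1
            if predicted:
                hits[group] += 1
    # Only groups with at least one true case appear in the result.
    return {g: hits[g] / positives[g] for g in positives}


# Hypothetical toy records: (group, truly has condition, model flagged it)
records = [
    ("A", True, True), ("A", True, True), ("A", True, False), ("A", False, False),
    ("B", True, True), ("B", True, False), ("B", True, False), ("B", False, True),
]

rates = recall_by_group(records)
gap = max(rates.values()) - min(rates.values())
print(rates, f"gap={gap:.2f}")
```

In this toy data the model catches two of three true cases in group A but only one of three in group B. A large gap like this is a signal to look at the training data before using the tool.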
Most of the AI failures above are linked to clinical decision support, but AI is also increasingly used to manage office work in healthcare. This use is important because it supports daily work in medical offices and improves patient experience. It allows healthcare workers to focus more on patient care.
One example is Simbo AI, a company that uses AI for front-office phone automation and answering. Simbo AI’s system manages phone calls in medical offices. It handles scheduling, patient questions, prescription refills, and other common calls. This cuts wait times and helps office staff while giving patients timely and correct info.
Why is AI front-office automation important?
Still, AI front-office systems must learn from the clinical AI lessons: they need constant testing, accurate data, and human checks to catch mistakes quickly. Companies like Simbo AI must keep high standards for speech recognition and language understanding to avoid frustrating patients or giving them wrong information.
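A human check can be as simple as a rule that sends uncertain or sensitive calls to staff. The sketch below shows the idea only. The `CallIntent` type, the intent labels, and the 0.85 threshold are hypothetical assumptions for illustration and do not describe how Simbo AI or any other vendor works.

```python
from dataclasses import dataclass


@dataclass
class CallIntent:
    label: str         # what the system thinks the caller wants
    confidence: float  # model confidence between 0 and 1


# Hypothetical policy values, not taken from any vendor.
CONFIDENCE_FLOOR = 0.85
ALWAYS_HUMAN = {"medication_dosage_question"}


def route(intent: CallIntent) -> str:
    """Return "human" or "automated" for a recognized call intent."""
    if intent.label in ALWAYS_HUMAN:
        return "human"  # sensitive topics always go to staff
    if intent.confidence < CONFIDENCE_FLOOR:
        return "human"  # the system is unsure, so a person takes over
    return "automated"


print(route(CallIntent("schedule_appointment", 0.95)))
print(route(CallIntent("schedule_appointment", 0.60)))
print(route(CallIntent("medication_dosage_question", 0.99)))
```

Logging every call that is handed to a person also gives the office a record it can review to find where the system makes mistakes.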
The United States leads many countries in using AI for clinical and office work. Australia, for example, lacks the funding and infrastructure to test clinical AI. In the U.S., 42% of Chief Information Officers named AI their top tech priority for 2025. With this much attention on AI, getting it right matters.
U.S. healthcare must learn from past AI failures in and out of medicine. Projects like IBM Watson for Oncology, Amazon’s biased hiring AI, and the Epic Sepsis Model show that data, testing, and ethics cannot be skipped.
Good leadership from hospital chiefs and IT managers can help future AI tools work better. They should require AI makers to be clear, watch over strong tests, and follow rules. This will lower risks and help healthcare get real benefits from AI.
AI use in healthcare will keep growing in both medical decisions and daily office tasks. Knowing where past AI projects went wrong helps U.S. healthcare groups make safer and smarter choices about AI.
With better data, strict testing, ethical care, and smart workflow use, AI can help medical offices give better care and work better.
Experience from around the world shows that while AI can help, it must be managed carefully and used responsibly. This stops costly mistakes that break trust and hurt patient safety. By learning these lessons, American healthcare leaders can build more reliable, efficient, and fair AI systems for the future.
Clinical AI refers to machine learning algorithms that utilize real-time electronic medical record (EMR) data to assist healthcare practitioners in making treatment, prognostic, or diagnostic decisions.
Despite potential benefits, Australian hospitals largely avoid clinical AI due to ethical, privacy, and safety concerns, as well as a lack of infrastructure for implementation.
Notable failures include the Epic Sepsis Model missing 67% of septic patients and IBM Watson’s struggle to deliver practical solutions after significant investment.
Certain implemented sepsis prediction models in international hospitals have reported reduced mortality rates, demonstrating AI’s potential benefits in clinical settings.
The SALIENT framework provides an end-to-end approach for testing and safely integrating AI into clinical practice, incorporating stages like problem definition and prospective evaluation.
Prospective trials necessitate an IT infrastructure that supports live EMR data access, allowing for comprehensive testing of AI interventions in real-time clinical environments.
Australia’s healthcare lacks the necessary infrastructure and funding for prospective AI trials, hindering the translation of research into practical applications.
The absence of clear regulatory frameworks for AI may create uncertainty among healthcare providers, impacting their willingness to adopt AI solutions.
Public funding is essential to develop the infrastructure needed for prospective trials, enabling hospitals to safely evaluate and implement AI systems.
International reporting standards like TRIPOD and CONSORT-AI provide detailed guidelines for evaluating AI, promoting transparency and ensuring that AI applications are rigorously tested before implementation.