Deploying LLMs in healthcare brings particular risks: the data is highly sensitive, and the cloud systems that host the models are complex. Many healthcare organizations run these AI models on cloud platforms because the models demand substantial computing power, but public and hybrid clouds introduce new security threats.
LLMs therefore need security measures that go beyond standard protections, keeping data safe and trustworthy at every layer, from the hardware up to the application.
Attestation is the process of verifying that the systems running AI are trustworthy and unmodified. It is usually performed only when a system starts, but healthcare workloads need that assurance continuously.
Researchers Jianchang Su and Wei Zhang created a framework for runtime attestation, which re-verifies the system continuously so that compromises can be detected while workloads are running. Their system integrates with Kubernetes, the container orchestrator used in many healthcare clouds.
This ongoing attestation forms what they call an unbroken chain of trust: hardware, containers, and applications are all verified, protecting sensitive data from a wide range of threats.
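To make the idea concrete, the sketch below shows what a continuous attestation loop can look like. The verifier URL, the evidence format, and the collect_evidence helper are hypothetical placeholders, not the researchers' actual implementation; a real deployment would use its attestation service's own API (for example, the Trustee service discussed later in this article).

```python
"""Minimal sketch of a continuous (runtime) attestation loop.

Assumptions: the verifier URL, the evidence format, and collect_evidence()
are hypothetical placeholders used only to illustrate re-verifying the
environment on an interval instead of only at boot.
"""
import time
import requests

VERIFIER_URL = "https://attestation.internal.example/verify"  # hypothetical endpoint
CHECK_INTERVAL_SECONDS = 60


def collect_evidence() -> dict:
    # Placeholder: in a real TEE this would gather hardware-signed
    # measurements (CPU/GPU quotes, container image digests, etc.).
    return {"platform": "example-tee", "measurements": {}}


def attest_once() -> bool:
    # Submit the current evidence and ask the verifier for a verdict.
    evidence = collect_evidence()
    resp = requests.post(VERIFIER_URL, json=evidence, timeout=10)
    resp.raise_for_status()
    return resp.json().get("trusted", False)


if __name__ == "__main__":
    # Re-verify the environment on a fixed interval, not only at startup.
    while True:
        if not attest_once():
            raise SystemExit("Attestation failed: environment no longer trusted")
        time.sleep(CHECK_INTERVAL_SECONDS)
```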
This matters for U.S. healthcare organizations that work with multiple cloud or software providers: it lowers the risks that come from shared infrastructure and keeps patient data safer.
Healthcare organizations already secure data at rest and in transit, but data must also remain protected while it is in use by AI models, and that is harder to achieve.
Confidential computing protects data in use by running workloads inside secure hardware enclaves called Trusted Execution Environments (TEEs). These create isolated areas where even cloud administrators cannot see the data or code while it is being processed, preserving both confidentiality and integrity for data in use.
Red Hat OpenShift supports confidential containers (CoCo), built on Kata Containers and the CNCF Confidential Containers project, which run AI workloads inside TEEs even on public clouds.
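As a rough illustration, a workload asks Kubernetes for a confidential container by selecting a confidential runtime class on the pod. The runtime class name ("kata-cc") and the image reference below are assumptions; the actual values depend on the platform and the Confidential Containers release in use.

```python
"""Sketch: requesting a confidential container via a runtime class.

Assumptions: the runtime class name ("kata-cc") and the image reference are
illustrative only; actual values depend on the platform and CoCo release.
"""
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside the cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="confidential-inference"),
    spec=client.V1PodSpec(
        runtime_class_name="kata-cc",  # assumed name of the confidential runtime class
        containers=[
            client.V1Container(
                name="inference",
                image="registry.example.com/llm-inference:latest",  # placeholder image
            )
        ],
    ),
)

# The pod is scheduled into a TEE-backed sandbox instead of a regular container.
client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```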
NVIDIA NIM (NVIDIA Inference Microservices) integrates with OpenShift AI and runs inference workloads at high speed on GPUs such as the NVIDIA Hopper H100. Combined with confidential containers, healthcare organizations can run AI models both securely and quickly.
Runtime attestation and confidential computing combined create strong AI security for healthcare. Red Hat and NVIDIA showed this by running NVIDIA NIM inside confidential containers on Azure confidential virtual machines with NVIDIA H100 GPUs.
The main parts of this system include Azure confidential virtual machines with NVIDIA H100 GPUs, OpenShift clusters running OpenShift AI, NVIDIA NIM inference microservices, confidential containers for TEE isolation, and a separate Trustee-based attestation service.
This design helps healthcare organizations protect PHI during AI inference, meet privacy and compliance requirements, and deploy AI models at scale within trusted environments.
One real-world use for this secure AI is front-office automation, where AI handles phone answering and first patient contact. Companies like Simbo AI offer AI phone systems that help medical offices handle high call volumes efficiently and safely.
Secure AI infrastructure lets Simbo AI provide voice and language services that comply with healthcare data rules.
Data confidentiality and trusted execution help AI systems meet HIPAA and other regulations, and help IT teams avoid the legal and financial consequences of data leaks or mishandling.
Medical administrators, owners, and IT experts in healthcare should choose AI tools that use runtime attestation, confidential computing, and container integrity checks. These capabilities have become must-haves as AI workloads move onto shared public and hybrid cloud infrastructure.
Red Hat OpenShift AI with NVIDIA NIM and confidential containers is a modern way to handle these challenges. These tools let AI run in isolated, trusted environments with continuous verification.
For IT managers, this means smoother operations and easier teamwork between clinical and technical staff. For administrators, it means better confidence that patient data is safe all through the AI process.
These technologies bring features that matter for U.S. healthcare, including runtime attestation, confidential computing with TEEs, and container integrity verification. Together, they form layered defenses suited to U.S. medical organizations that handle sensitive data under strict regulations.
As AI advances, healthcare in the United States needs infrastructure that supports secure, reliable AI use. Runtime attestation, confidential computing, and integrity checks are central to that goal: they allow LLMs to be adopted safely in front-office work, clinical tasks, and beyond, so healthcare can benefit from AI while keeping patient data well protected.
Red Hat OpenShift AI is a flexible, scalable AI and ML platform that enables enterprises to create, train, and deliver AI applications at scale across hybrid cloud environments. It offers trusted, operationally consistent capabilities to develop, serve, and manage AI models, leveraging infrastructure automation and container orchestration to streamline AI workloads deployment and foster collaboration among data scientists, developers, and IT teams.
NVIDIA NIM is a cloud-native microservices inference engine optimized for generative AI, deployed as containerized microservices on Kubernetes clusters. Integrated with OpenShift AI, it provides a scalable, low-latency platform for deploying multiple AI models seamlessly, simplifying AI functionality integration into applications with minimal code changes, autoscaling, security updates, and unified monitoring across hybrid cloud infrastructures.
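The "minimal code changes" point can be illustrated with the OpenAI-compatible API that NIM exposes: an existing client typically only needs its base URL pointed at the NIM service. The service URL, API key handling, and model name below are deployment-specific assumptions.

```python
"""Sketch: calling a NIM inference service through its OpenAI-compatible API.

Assumptions: the in-cluster service URL, the API key handling, and the model
name are placeholders for a particular deployment.
"""
from openai import OpenAI

client = OpenAI(
    base_url="http://nim-llm.example.svc.cluster.local:8000/v1",  # assumed service URL
    api_key="not-used",  # placeholder; authentication depends on the deployment
)

response = client.chat.completions.create(
    model="meta/llama-3.1-8b-instruct",  # example model; use whatever your NIM instance serves
    messages=[{"role": "user", "content": "Summarize today's appointment schedule."}],
    max_tokens=200,
)
print(response.choices[0].message.content)
```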
Confidential containers are isolated hardware enclave-based containers that protect data and code from privileged users including administrators by running workloads within trusted execution environments (TEEs). Built on Kata Containers and CNCF Confidential Containers standards, they secure data in use by preventing unauthorized access or modification during runtime, crucial for regulated industries handling sensitive data.
Confidential computing uses hardware-based TEEs to isolate and encrypt data and code during processing, protecting against unauthorized access, tampering, and data leakage. In OpenShift AI with NVIDIA NIM, this strengthens AI inference security by preventing prompt injection, sensitive information disclosure, data/model poisoning, and other top OWASP LLM security risks, enhancing trust in AI deployments for sensitive sectors like healthcare.
Attestation verifies the trustworthiness of the TEE hosting the workload, ensuring that both CPU and GPU environments are secure and unaltered. In a CoCo deployment it is performed by the Trustee project, which validates the integrity of the confidential environment and delivers secrets securely only after successful verification, reinforcing the security of data and AI models in execution.
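This verify-then-release principle can be sketched as follows. The endpoints, payloads, and token handling below are hypothetical and do not reflect the actual Trustee API; they only illustrate that a secret (such as a model decryption key) is released to a workload after its TEE evidence has been verified.

```python
"""Illustrative verify-then-release flow (not the actual Trustee/KBS API).

Assumptions: endpoint paths, payload shapes, and the token format are
hypothetical placeholders for the principle that secrets are delivered only
after successful attestation.
"""
import requests

ATTESTATION_URL = "https://trustee.internal.example/attest"        # hypothetical
SECRET_URL = "https://trustee.internal.example/secret/model-key"   # hypothetical


def fetch_model_key() -> bytes:
    # Step 1: submit TEE evidence (e.g., CPU and GPU quotes) for verification.
    evidence = {"cpu_quote": "...", "gpu_quote": "..."}  # placeholder measurements
    attestation = requests.post(ATTESTATION_URL, json=evidence, timeout=10)
    attestation.raise_for_status()
    token = attestation.json()["token"]  # assumed attestation token

    # Step 2: only a holder of a valid attestation token can fetch the secret.
    secret = requests.get(
        SECRET_URL, headers={"Authorization": f"Bearer {token}"}, timeout=10
    )
    secret.raise_for_status()
    return secret.content
```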
NVIDIA H100 GPUs with confidential computing capabilities run inside confidential virtual machines (CVMs) within the TEE. Confidential containers orchestrate workloads to ensure GPU resources are isolated and protected from unauthorized access. Attestation confirms GPU environment integrity, ensuring secure AI inferencing while maintaining high performance for computationally intensive tasks.
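Operators may also want to confirm from inside the workload that the GPU's confidential-computing mode is active. The sketch below shells out to nvidia-smi for that check; the exact subcommand and flags vary by driver version, so treat the invocation as an assumption and consult the documentation for the deployed driver.

```python
"""Sketch: checking whether the GPU's confidential-computing mode is enabled.

Assumption: the `nvidia-smi conf-compute -f` invocation varies by driver
version; verify the correct flags for the driver in your environment.
"""
import subprocess


def gpu_confidential_mode_report() -> str:
    # Query the driver for the confidential-computing feature status.
    result = subprocess.run(
        ["nvidia-smi", "conf-compute", "-f"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print(gpu_confidential_mode_report())
```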
The deployment includes Azure public cloud with confidential VMs supporting NVIDIA H100 GPUs, OpenShift clusters for workload orchestration, OpenShift AI for AI workload lifecycle management, NVIDIA NIM for inference microservices, confidential containers for TEE isolation, and a separate attestation operator cluster running Trustee for environment verification and secret management.
By using confidential containers and attested TEEs, the platform mitigates prompt injection attacks, protects sensitive information during processing, prevents data and model poisoning, counters supply chain tampering through integrity checks, secures model intellectual property, enforces strict trusted execution policies to limit excessive agency, and controls resource consumption to prevent denial-of-service attacks.
This unified platform offers enhanced data security and privacy compliance by protecting PHI data during AI inferencing. It enables scalable deployment of AI models with trusted environments, thus facilitating sensitive healthcare AI applications. The platform reduces regulatory risks, improves operational consistency, and supports collaboration between healthcare data scientists and IT teams, advancing innovative AI-driven services securely.
Separating the attestation operator to a trusted, private OpenShift cluster ensures that the environment performing verification and secret management remains out of reach of cloud providers and potential adversaries, thereby maintaining a higher security level. This segregation strengthens the trustworthiness of TEEs running confidential workloads on public cloud infrastructure by isolating critical attestation functions.