As interest in AI soars, security leaders are prioritizing an architecture framework that supports innovation and delivers end-to-end protection of sensitive data and models, all while mitigating the risks of data exfiltration, model poisoning, and other malicious uses.
Inadvertent leaks from AI models trained on personally identifiable information (PII), users sharing sensitive data via genAI prompts, and the use of AI to create deepfakes or generate exploits are just some of the nightmare scenarios security leaders are up against as they architect a security infrastructure primed for the emerging AI era.
Building a platform that delivers end-to-end protection for AI workloads is a tall order. Microsoft, in an ongoing partnership with NVIDIA, has engineered a solution by bringing confidential computing to the Azure cloud with an industry first: Azure confidential VMs with NVIDIA H100 Tensor Core GPUs.
As defined by the Confidential Computing Consortium, a group dedicated to accelerating the adoption of technologies and standards in this space, confidential computing protects data in use by executing computation in a hardware-based, attested Trusted Execution Environment (TEE). This creates a secure, isolated space that helps shield applications and data in use from memory attacks, even those that exploit a flaw in the host operating system or the hypervisor. Confidential computing is particularly relevant for AI workloads because models process sensitive data and the models themselves are high-value assets. Companies in highly regulated sectors such as government, finance, and healthcare need assurance that neither unauthorized third parties nor cloud operators can access models and their associated data while in use in the cloud.
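In practice, "hardware-based and attested" means a workload refuses to touch secrets until signed hardware evidence has been checked against a known-good policy. The minimal sketch below illustrates that trust decision; the Evidence shape, measurement names, and digests are all invented for illustration, and real deployments rely on platform attestation services rather than hand-rolled checks.

```python
# Minimal sketch of the attested-TEE trust decision. The Evidence layout
# and measurement values are invented placeholders, not a real format.
from dataclasses import dataclass

@dataclass
class Evidence:
    tee_type: str            # e.g. "SEV-SNP"
    measurements: dict       # launch measurements reported by the hardware
    signature_valid: bool    # did the hardware signature verify?

def attest(evidence: Evidence, policy: dict) -> bool:
    """Admit a workload only if the evidence is signed and matches policy."""
    return evidence.signature_valid and all(
        evidence.measurements.get(name) == digest
        for name, digest in policy.items()
    )

# Known-good measurements for the VM image we intend to run (placeholders).
policy = {"firmware": "sha384:1f2e...", "kernel": "sha384:9ab0..."}

evidence = Evidence("SEV-SNP",
                    {"firmware": "sha384:1f2e...", "kernel": "sha384:9ab0..."},
                    signature_valid=True)

if attest(evidence, policy):
    print("TEE attested: safe to decrypt data and load the model")
else:
    raise RuntimeError("attestation failed: secrets stay sealed")
```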
“The Azure confidential VMs with NVIDIA H100 GPUs bring a complete, highly secure computing stack from the VMs to the GPU architecture, enabling developers to build and deploy AI applications with assurances that their critical data, intellectual property, and AI models remain protected end to end,” says Vikas Bhatia, Head of Product for Azure confidential computing at Microsoft.
“This offers Azure customers more options and flexibility to securely run their workloads involving sensitive data on the cloud to meet privacy and regulatory concerns,” Bhatia says.
Confidential computing with GPUs at work
Encryption has long protected data at rest on disk and data in transit on the network; Azure confidential VMs with NVIDIA H100 GPUs deliver the third pillar, securing data and models while they are in use in memory through a TEE that spans both the CPU and the GPU. Inside the TEE, workloads are shielded from privileged attackers, including administrators and entities with physical access, so application code, models, and data remain protected at all times.
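A common way applications consume that third pillar is to gate secrets on attestation: the key that decrypts a model or dataset is released only to a VM that has proven it is running inside a genuine TEE, a pattern Azure supports through secure key release. The sketch below shows the shape of that flow with two stand-in functions, request_attestation_token and release_model_key; it is not the actual Azure API.

```python
# Sketch of attestation-gated key release; both helpers are stand-ins
# for real services, shown only to illustrate the control flow.

def request_attestation_token() -> str:
    """Stand-in for the guest attestation flow inside the confidential VM,
    which returns a token signed by an attestation service after the
    hardware evidence has been validated."""
    return "eyJhbGciOi...signed-attestation-token"

def release_model_key(token: str) -> bytes:
    """Stand-in for a key service that validates the attestation token
    before releasing the key; on failure the key never leaves the vault."""
    if not token.startswith("eyJ"):
        raise PermissionError("attestation token rejected; key not released")
    return bytes(32)  # placeholder 256-bit model decryption key

token = request_attestation_token()
model_key = release_model_key(token)
# The encrypted model is decrypted only now, inside the TEE, so the
# plaintext weights never exist outside protected memory.
```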
Currently showcased in a gated preview, the VMs use a two-step attestation process. The first step is VM attestation, in which the guest attestation agent gathers crucial evidence and endorsements: TCG logs that capture boot measurements and boot cookies, an SNP report that carries the TPM attestation key (AK), the AMD VCEK certificate chain that signs the SNP report, a quote of PCR values signed by the AK, and a freshness nonce, also signed by the AK.
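To make the relationships among those artifacts concrete, here is a compressed sketch of how a verifier might stitch them together. The evidence layout and field names are invented for illustration, and the TCG log replay is collapsed to a single digest; in production, a service such as Microsoft Azure Attestation performs these checks, replaying the log event by event into per-PCR values.

```python
# Simplified verification flow for the VM attestation evidence; the
# evidence layout and single-digest log replay are illustrative only.
import hashlib
import secrets

def verify_vm_evidence(evidence: dict, nonce: bytes) -> bool:
    # 1. The AMD VCEK certificate chain must chain back to AMD's root
    #    and must have signed the SNP report (placeholder flag here).
    if not evidence["vcek_chain_valid"]:
        return False
    # 2. The SNP report must bind the TPM attestation key (AK), so that
    #    quotes signed by the AK inherit the hardware root of trust.
    if evidence["snp_report"]["ak_digest"] != evidence["ak_digest"]:
        return False
    # 3. Replaying the TCG boot log must reproduce the PCR values the
    #    AK-signed quote reports (collapsed to one digest for brevity).
    if (hashlib.sha256(evidence["tcg_log"]).hexdigest()
            != evidence["pcr_quote"]["pcr_digest"]):
        return False
    # 4. The quote must echo our freshness nonce, ruling out replays.
    return evidence["pcr_quote"]["nonce"] == nonce

# Example round trip with self-consistent dummy evidence.
nonce = secrets.token_bytes(32)
log = b"measured-boot-events"
evidence = {
    "vcek_chain_valid": True,
    "ak_digest": "ak-1",
    "snp_report": {"ak_digest": "ak-1"},
    "tcg_log": log,
    "pcr_quote": {"pcr_digest": hashlib.sha256(log).hexdigest(),
                  "nonce": nonce},
}
print(verify_vm_evidence(evidence, nonce))  # True
```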
The second step is GPU attestation. A GPU attestation verifier can run locally inside the confidential VM to verify the signed GPU report, or verification can be delegated to the NVIDIA Remote Attestation Services (NRAS). Either path compares the reported firmware measurements against Reference Integrity Manifests (RIMs) signed and published by NVIDIA. The attestation agent also checks the revocation status of the GPU certificate chain and of the RIM signing certificate chain.
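The two paths share the same verdict logic and differ mainly in where the checking runs, as the sketch below suggests. Every helper function here is a stand-in and the NRAS endpoint shown is an assumption; NVIDIA publishes the actual verifier tooling and attestation SDK that implement both paths.

```python
# Illustrative sketch of local vs. remote GPU attestation. The endpoint
# constant and all helpers are stand-ins, not NVIDIA's actual API.
NRAS_URL = "https://nras.attestation.nvidia.com"  # assumed endpoint

def signature_ok(report: bytes) -> bool:
    """Stand-in: verify the GPU's signature over the attestation report."""
    return report.startswith(b"GPU-REPORT")

def measurements_of(report: bytes) -> bytes:
    """Stand-in: extract firmware measurements from the signed report."""
    return report.removeprefix(b"GPU-REPORT:")

def not_revoked(cert_chain: str) -> bool:
    """Stand-in: consult revocation lists for the given certificate chain."""
    return cert_chain != "revoked"

def verify_gpu_locally(report: bytes, rim: dict) -> bool:
    """Local path: the verifier runs inside the confidential VM and
    compares reported measurements against the NVIDIA-published RIM."""
    return (signature_ok(report)
            and measurements_of(report) == rim["golden_measurements"]
            and not_revoked(rim["signing_cert_chain"]))

def verify_gpu_remotely(report: bytes) -> bool:
    """Remote path: submit the report to NRAS and trust its signed verdict
    (stubbed here; a real client would validate the returned token)."""
    print(f"would submit report to {NRAS_URL}")
    return signature_ok(report)

rim = {"golden_measurements": b"fw-v1", "signing_cert_chain": "nvidia-rim"}
report = b"GPU-REPORT:fw-v1"
print(verify_gpu_locally(report, rim), verify_gpu_remotely(report))
```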
Microsoft and NVIDIA are working together to improve this experience, with more seamless CPU and GPU attestation capabilities to roll out in subsequent phases. Their goal: providing confidence that data is secure throughout the entire AI lifecycle, from protecting models and prompt data from unauthorized access during inferencing or training to safeguarding prompts and the resulting responses.
To learn more and get started with these new GPU-enabled confidential VMs, visit us here.