Confidential Containers for GPU Compute: Incorporating LLMs in a Lift-and-Shift Strategy for AI - Zvonko Kaiser, NVIDIA
Abstract
This proposal discusses the evolution of confidential containers, integrating them into the GPU cloud-native stack for AI/ML workloads. We explore the transition from traditional container deployments to the secure, isolated environments that sensitive data processing demands. Our choice of Kata Containers for confidential container enablement provides strong isolation while preserving the flexibility of the container workflow. Alongside this, a virtualization reference architecture supports advanced scenarios such as GPUDirect RDMA. A key aspect of our strategy is the lift-and-shift approach, which allows existing AI/ML workloads to migrate into these confidential environments without modification. This integration combines LLMs with GPU-accelerated computing, leveraging Kubernetes for orchestration and striking a balance between computational power and data privacy.
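To make the lift-and-shift idea concrete: in a Kubernetes cluster with Kata Containers installed, moving a workload into a VM-isolated environment can come down to selecting a Kata-backed RuntimeClass in the Pod spec. The sketch below is illustrative, not taken from the talk; the RuntimeClass name, image name, and exact cluster setup are assumptions and depend on how Kata and the NVIDIA device plugin are deployed.

```yaml
# Hypothetical Pod spec: an existing AI/ML container image, lifted into an
# isolated Kata VM by selecting a Kata-based runtime class.
apiVersion: v1
kind: Pod
metadata:
  name: llm-inference
spec:
  # Assumed RuntimeClass name; the actual name is defined by the
  # cluster's Kata Containers deployment.
  runtimeClassName: kata-qemu-nvidia-gpu
  containers:
  - name: inference
    image: my-registry/llm-inference:latest   # unchanged workload image
    resources:
      limits:
        nvidia.com/gpu: 1   # GPU exposed via the NVIDIA device plugin
```

Dropping the `runtimeClassName` line would run the same container on the cluster's default runtime, which is the essence of lift-and-shift: the workload definition stays the same while the isolation boundary changes underneath it.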