Public Preview: Azure Container Apps serverless GPUs now support Azure AI Foundry models
Overview of Azure Container Apps Serverless GPUs with Azure AI Foundry Models
Azure has announced a public preview of Azure AI Foundry model support for Azure Container Apps serverless GPUs. With this integration, users can deploy ready-to-use AI models directly during container app creation, adding flexibility and efficiency to AI application development.
Key Features
- Azure AI Foundry Models Support: Azure Container Apps now supports deploying Azure AI Foundry models, allowing users to integrate these models during app creation for efficient AI workflows[1][3].
- Serverless GPUs: These GPUs provide automatic scaling, optimized cold starts, and per-second billing. Users pay only for the GPU compute they use, and apps scale down to zero when not in use[2][4].
- Benefits for AI Development: Serverless GPUs accelerate AI development by letting teams focus on core AI code rather than infrastructure management. The feature provides a middle layer between the Azure AI Model Catalog's serverless APIs and hosting custom models on managed compute[2][4].
- Data Governance: Data never leaves the container boundary, ensuring full data governance and security while providing a managed serverless platform[2][4].
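As a sketch of how the scale-to-zero and per-second-billing behavior above might be configured with the Azure CLI (all resource names, the image, and the workload profile name here are illustrative placeholders, not values from the announcement):

```shell
# Create a container app that scales to zero when idle, so GPU compute
# is billed per second only while replicas are running.
# my-rg, my-env, gpu-app, and the image are placeholder names.
az containerapp create \
  --name gpu-app \
  --resource-group my-rg \
  --environment my-env \
  --image myregistry.azurecr.io/my-inference-image:latest \
  --workload-profile-name gpu \
  --min-replicas 0 \
  --max-replicas 3 \
  --ingress external \
  --target-port 8000
```

Setting `--min-replicas 0` is what allows the app to release its GPU entirely between requests, at the cost of a cold start on the next request.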
How It Works
- Access to GPUs: Users must request GPU quotas through a customer support case to access serverless GPUs[2][5].
- Workload Profiles: Serverless GPUs require a workload profiles environment with a Consumption GPU workload profile; they are not supported in Consumption-only environments[2][5].
- NVIDIA GPUs: Users can choose between NVIDIA A100 and T4 GPUs for their applications[2].
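The setup steps above might look roughly like the following with the Azure CLI. This is a sketch: the environment and resource names are placeholders, and the GPU workload profile type strings are assumptions that should be verified for your region.

```shell
# Create a workload profiles environment (serverless GPUs are not
# available in Consumption-only environments). Names are placeholders.
az containerapp env create \
  --name my-env \
  --resource-group my-rg \
  --location westus3

# Add a serverless GPU workload profile to the environment.
# The T4/A100 profile type strings below are assumed; check what is
# offered in your region with:
#   az containerapp env workload-profile list-supported --location westus3
az containerapp env workload-profile add \
  --name my-env \
  --resource-group my-rg \
  --workload-profile-name gpu \
  --workload-profile-type Consumption-GPU-NC8as-T4
```

Apps then opt into the GPU profile at creation time by referencing its `--workload-profile-name`; GPU quota must already have been granted via a support case.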
Recent Developments
- General Availability: The underlying Azure Container Apps serverless GPUs feature is now generally available, providing a robust platform for AI workloads with automatic scaling and per-second billing, while Azure AI Foundry model support remains in preview[4].
- NVIDIA NIM Integration: Serverless GPUs support NVIDIA NIM microservices for secure and scalable AI model inferencing[4].
- Dynamic Sessions: Early access is available for Serverless GPU in Dynamic Sessions, enabling running untrusted AI-generated code within compute sandboxes protected by Hyper-V isolation[3].
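To illustrate the NIM integration mentioned above: NIM microservices expose an OpenAI-compatible HTTP API, so a container app running one could be queried roughly as follows. The app FQDN and model identifier are hypothetical placeholders.

```shell
# Query a NIM microservice hosted on a serverless-GPU container app.
# The URL and model name below are hypothetical examples.
curl https://gpu-app.example-env.westus3.azurecontainerapps.io/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "meta/llama-3.1-8b-instruct",
        "messages": [{"role": "user", "content": "Hello"}],
        "max_tokens": 64
      }'
```

Because the request never leaves the container app's ingress, inference traffic stays within the container boundary described under Data Governance.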