About the Role
Infrastructure Architect - AI & Datacenter
Location: Remote
Posted On: 05/05/2026
Requirement Code: 73652
Requirement Detail
Role Overview
We are looking for a Principal Infrastructure Architect to join our IT PMO organization and lead the design, orchestration, and lifecycle management of our next-generation GPU Farm and AI Factory environments. This role is unique in its breadth: it requires a deep understanding of high-performance AI compute stacks alongside disciplined management of physical data center assets and their long-term operational health. You will bridge the gap between R&D engineering requirements and the physical realities of global data center operations.
Key Responsibilities
1. AI & GPU Infrastructure Design (GPU Farm / AI Factory)
• Lead the architectural design and refinement of the Nutanix GPU-as-a-Service (GPUaaS) platform, ensuring a seamless experience for internal R&D, QA, and Sales teams.
• Provide technical leadership on key initiatives such as Nutanix Validated Designs (NVD) for the AI Factory, incorporating NVIDIA MGX/HGX architectures and high-density Cisco nodes (e.g., UCS 845A).
• Architect the Management Cluster control plane (NKP, Prism Central, NuDeploy) to ensure it is decoupled from GPU compute nodes for maximum efficiency.
• Implement policy-driven placement of workloads across on-prem and cloud-burst environments.
2. Data Center Asset & Lifecycle Management
• Design a centralized Data Center Asset Inventory system that provides real-time visibility into all assets, including CPUs, GPUs, virtual machines, and networking equipment.
• Develop a comprehensive Hardware Lifecycle Management strategy, including procurement forecasting, "rack and stack" operationalization, and decommissioning of legacy systems (G3/G4/G5).
• Lead "Tiger Team" initiatives to navigate supply chain constraints, ensuring critical release milestones are not delayed by hardware shortages.
• Enforce strict Security Standards for Data Center HW Provisioning.
• Implement network segmentation for all critical applications.
• Ensure all infrastructure meets SOC 2 and ISO 27001 compliance objectives while maintaining low-latency performance.
3. Special Projects
• Provide the required architecture and designs during the project intake process. Review all demands and guide teams toward the right architecture before they become approved projects.
• Partner with the security team to provide guidelines for upcoming projects.
• Participate in and lead special projects as the architect.
Preferred Qualifications
• Experience as an architect managing massive-scale data center environments (1,000+ nodes).
• Knowledge of Nutanix Cloud Infrastructure (NCI), AHV, and Prism Central.
• Strong background in MLOps and automated pipeline integration (Kubeflow/MLflow).