Cisco launches integrated AI infrastructure ‘Pods’

Cisco has packaged its AI infrastructure for enterprises looking to purchase a certified technology stack for AI inference, which is the process of running trained models to generate predictions or responses from new data.

Cisco introduced the AI packages, called Pods, Tuesday at its Partner Summit in Los Angeles. The company, which sells most of its products through channel partners, will begin taking orders for Pods in November.

Four Pod configurations, varying by the number of CPUs and Nvidia H100 and H200 Tensor Core GPUs, will be available. Each configuration will be offered within a Cisco UCS X-Series Modular System, which is managed and monitored through Cisco’s cloud-based Intersight software.

Many organizations use cloud providers, such as AWS, Google and Microsoft, to run AI applications. Pods are tailored AI infrastructure for organizations that want to keep data on-premises for security or compliance reasons, said Michael Leone, an analyst at TechTarget’s Enterprise Strategy Group.

Cisco’s strategy of providing pre-integrated stacks for AI applications is similar to that of competitors Dell, Hewlett Packard Enterprise and Lenovo.

“Everybody is trying to put together their validated stack, their pre-integrated stack, to address inferencing,” Leone said. “This is the opportunity for all traditional infrastructure vendors.”

AI model inference often includes a technique called retrieval-augmented generation, or RAG. RAG retrieves relevant private data at query time and passes it to the model as context, which lets enterprises adapt a model’s output to specific tasks without retraining it. Enterprise inference tasks include fraud detection, text summarization, personal digital assistants and medical image analysis.
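The retrieval step in RAG can be illustrated with a minimal Python sketch. It is not Cisco's or Nvidia's implementation. The sample documents, the function names and the word-overlap scoring are all assumptions made for illustration; production systems typically use learned embeddings and a vector database, and then send the assembled prompt to a model.

```python
import math
import re
from collections import Counter

# Hypothetical private documents; an enterprise would load these from its own stores.
DOCUMENTS = [
    "Refund requests above 500 dollars require manager approval.",
    "The VPN client must be updated every quarter.",
    "Fraud alerts are triggered by three failed card payments in one hour.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counter objects)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    query_vec = Counter(tokenize(query))
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vec, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Prepend the retrieved context to the question; the model itself is unchanged."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    print(build_prompt("What triggers fraud alerts?", DOCUMENTS))
```

The point of the sketch is that the private data enters through the prompt at inference time, so the model's weights do not need to change.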

Cisco’s X-Series Modular system for Pods uses a 7RU UCS X9508 chassis. UCS, or Unified Computing System, is an integrated data center or edge platform that includes computing, networking, management, virtualization and storage.

Each UCS X9508 chassis can house up to eight UCS X-Series M7 computing servers and four X440p PCIe nodes that support up to 16 Nvidia GPUs. Other chassis components include:

The UCS 9108 Intelligent Fabric Module for up to 100 Gbps of connectivity per computing node, with eight uplink ports of either 25 Gbps SFP28 or 100 Gbps QSFP28 connections;
The 1RU UCS 6536 10/25/40/100 Gigabit Ethernet, Fibre Channel over Ethernet and Fibre Channel switch, with 36 ports and 7.42 Tbps of throughput. Another option is the UCS Fabric Interconnect 9108 100G Intelligent Fabric Module for enterprises using UCS blade servers within the X9508 chassis;
The UCS X9416 X-Fabric Module plug-in for the X9508 chassis. The module provides direct PCIe connections from each computing node to the GPUs.

Pods include Intersight and a subscription to the Nvidia AI Enterprise (NVAIE) software suite and the Nvidia HPC-X software toolkit. NVAIE provides tools and frameworks to fine-tune the included pre-trained models for specific tasks. The Nvidia HPC-X toolkit offers the technology needed to optimize high-performance computing applications.

Other Pod components include licensing for the Red Hat OpenShift platform, which is used to develop and deploy AI applications across hybrid cloud environments. Optional pieces include storage from NetApp or Pure Storage. Both vendors offer toolkits that help developers and data scientists perform data management tasks.