At its AI-focused I/O event, Google announced that Google Cloud customers would be able to start using A3 virtual machines powered by NVIDIA H100 GPUs in a private preview. The search giant said that its new A3 VMs were a “step forward” for customers developing advanced machine learning models.
The key features of the A3 GPU virtual machines (VMs) are as follows:
- 8 H100 GPUs utilizing NVIDIA’s Hopper architecture, delivering 3x compute throughput
- 3.6 TB/s bisectional bandwidth between A3’s 8 GPUs via NVIDIA NVSwitch and NVLink 4.0
- Next-generation 4th Gen Intel Xeon Scalable processors
- 2TB of host memory via 4800 MHz DDR5 DIMMs
- 10x greater networking bandwidth powered by Google’s hardware-enabled IPUs, a specialized inter-server GPU communication stack, and NCCL optimizations
Using these virtual machines, businesses that need to train complex ML models can do so much more quickly. The VMs are built with the demanding models behind today’s generative AI in mind.
“Google Cloud’s A3 VMs, powered by next-generation NVIDIA H100 GPUs, will accelerate training and serving of generative AI applications,” said Ian Buck, vice president of hyperscale and high performance computing at NVIDIA. “On the heels of Google Cloud’s recently launched G2 instances, we’re proud to continue our work with Google Cloud to help transform enterprises around the world with purpose-built AI infrastructure.”
According to Google, its new A3 supercomputers can provide up to 26 exaFlops of AI performance, and A3 is the first GPU instance to use custom-designed 200 Gbps IPUs, with GPU-to-GPU data transfers that can bypass the CPU host. Google says this enables up to ten times more network bandwidth than its previous-generation A2 VMs.
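To put the quoted bandwidth figures on a common scale, here is a rough back-of-the-envelope sketch in Python. It only converts the two numbers quoted above (200 Gbps per IPU and 3.6 TB/s of bisectional bandwidth inside a VM) into the same unit. It assumes decimal units and 8 bits per byte, ignores protocol overhead, and treats the 200 Gbps figure as a single link, since the announcement does not say how many such links each VM has. The function and constant names are illustrative only.

```python
# Rough unit conversion for the bandwidth figures quoted in the article.
# Assumptions (not from Google): decimal units, 8 bits per byte,
# no protocol overhead, and a single 200 Gbps link considered in isolation.

IPU_LINK_GBPS = 200.0          # gigabits per second, as quoted
NVLINK_BISECTION_TBPS = 3.6    # terabytes per second, as quoted


def gbps_to_gigabytes_per_s(gbps: float) -> float:
    """Convert gigabits per second to gigabytes per second."""
    return gbps / 8.0


def tbps_to_gigabytes_per_s(terabytes_per_s: float) -> float:
    """Convert terabytes per second to gigabytes per second (decimal)."""
    return terabytes_per_s * 1000.0


ipu_gb_per_s = gbps_to_gigabytes_per_s(IPU_LINK_GBPS)
nvlink_gb_per_s = tbps_to_gigabytes_per_s(NVLINK_BISECTION_TBPS)
ratio = nvlink_gb_per_s / ipu_gb_per_s

print(f"One 200 Gbps link: about {ipu_gb_per_s:.0f} GB/s")
print(f"Intra-VM bisection: about {nvlink_gb_per_s:.0f} GB/s")
print(f"Ratio (intra-VM / one link): about {ratio:.0f}x")
```

Under these assumptions, traffic between GPUs in the same VM has far more headroom than traffic leaving the VM over one such link, which is consistent with the emphasis the announcement places on inter-server networking.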