Microsoft’s new H200 v5 series VMs for Azure aim to supercharge GPU performance


Microsoft has announced the launch of new Azure virtual machines (VMs) aimed specifically at ramping up cloud-based AI supercomputing capabilities.
The new H200 v5 series VMs are now generally available for Azure customers and will enable enterprises to meet increasingly demanding AI workloads.
Harnessing the new VM series, users can supercharge foundation model training and inferencing capabilities, the tech giant revealed.
Scale, efficiency and performance
In a blog post, Microsoft said the new VM series is already being put to use by a raft of customers and partners to drive AI capabilities.
“The scale, efficiency, and enhanced performance of our ND H200 v5 VMs are already driving adoption from customers and Microsoft AI services, such as Azure Machine Learning and Azure OpenAI Service,” the company said.
Among these is OpenAI, according to Trevor Cai, OpenAI’s head of infrastructure, which is harnessing the new VM series to drive research and development and fine-tune ChatGPT for users.
“We’re excited to adopt Azure’s new H200 VMs,” he said. “We’ve seen that H200 offers improved performance with minimal porting effort; we are looking forward to using these VMs to accelerate our research, improve the ChatGPT experience, and further our mission.”
Under the hood of the H200 v5 series
Azure H200 v5 VMs are architected with Microsoft’s systems approach to “enhance efficiency and performance,” the company said, and include eight Nvidia H200 Tensor Core GPUs.
Microsoft said this addresses a growing ‘gap’ for enterprise users with regard to compute power.
With GPUs growing in raw computational capabilities at a faster rate than attached memory and memory bandwidth, this has created a bottleneck for AI inferencing and model training, the tech giant said.
“The Azure ND H200 v5 series VMs deliver a 76% increase in High Bandwidth Memory (HBM) to 141GB and a 43% increase in HBM Bandwidth to 4.8 TB/s over the previous generation of Azure ND H100 v5 VMs,” Microsoft said in its announcement.
“This increase in HBM bandwidth enables GPUs to access model parameters faster, helping reduce overall application latency, which is a critical metric for real-time applications such as interactive agents.”
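The quoted uplifts can be sanity-checked against the previous generation's published specs. A minimal sketch follows; note that the H100 baseline figures (80GB HBM, 3.35 TB/s for the SXM part) are Nvidia's published numbers, not taken from Microsoft's announcement.

```python
# Sanity check of the generation-over-generation HBM gains quoted above.
# H100 SXM baseline (80 GB HBM, 3.35 TB/s) is Nvidia's published spec,
# not a figure from the article itself.

h100_hbm_gb, h200_hbm_gb = 80, 141
h100_bw_tbs, h200_bw_tbs = 3.35, 4.8

capacity_gain = (h200_hbm_gb / h100_hbm_gb - 1) * 100   # capacity uplift, %
bandwidth_gain = (h200_bw_tbs / h100_bw_tbs - 1) * 100  # bandwidth uplift, %

print(f"HBM capacity gain:  {capacity_gain:.0f}%")   # ~76%
print(f"HBM bandwidth gain: {bandwidth_gain:.0f}%")  # ~43%
```

Both results line up with Microsoft's stated 76% and 43% figures.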
Additionally, the new VM series can accommodate larger, more complex large language models (LLMs) within the memory of a single machine, the company said. This improves performance and helps users avoid the costly overheads of running distributed applications across multiple VMs.
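A rough back-of-the-envelope calculation illustrates the single-machine point. The sketch below assumes 16-bit (fp16/bf16) weights at 2 bytes per parameter and uses the Llama 3.1 405B model Microsoft benchmarks later in the announcement; the per-GPU HBM figure comes from the article.

```python
# Can a large model's weights fit in the aggregate HBM of a single
# ND H200 v5 VM (8 GPUs x 141 GB)? Illustrative estimate assuming
# fp16/bf16 weights (2 bytes per parameter); ignores KV cache and
# activation memory, which also consume HBM in practice.

GPUS_PER_VM = 8
HBM_PER_GPU_GB = 141  # H200 HBM capacity, per Microsoft's announcement

def weights_size_gb(n_params_billion: float, bytes_per_param: int = 2) -> float:
    """Approximate size of model weights in GB."""
    return n_params_billion * 1e9 * bytes_per_param / 1e9

total_hbm = GPUS_PER_VM * HBM_PER_GPU_GB   # 1128 GB aggregate
llama_405b = weights_size_gb(405)          # ~810 GB in fp16

print(f"Aggregate HBM per VM: {total_hbm} GB")
print(f"Llama 3.1 405B weights (fp16): ~{llama_405b:.0f} GB")
print("Weights fit in one VM:", llama_405b < total_hbm)  # True
```

On the previous-generation ND H100 v5 (8 × 80GB = 640GB aggregate), the same fp16 weights alone would exceed a single VM's HBM, which is the bottleneck the H200 series is positioned to relieve.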
Better management of GPU memory for model weights and batch sizes is also a key differentiator for the new VM series, Microsoft believes.
GPU memory limitations have a direct impact on throughput and latency for LLM-based inference workloads, and create additional costs for enterprises.
By drawing upon a larger HBM capacity, the H200 v5 VMs are capable of supporting larger batch sizes, which Microsoft said drastically improves GPU utilization and throughput compared to previous iterations.
“In early tests, we observed up to 35% throughput increase with ND H200 v5 VMs compared to the ND H100 v5 series for inference workloads running the LLAMA 3.1 405B model (with world size 8, input length 128, output length 8, and maximum batch sizes – 32 for H100 and 96 for H200),” the company said.