NVIDIA teams up with Microsoft to build a massive cloud AI computer. Tens of thousands of NVIDIA GPUs, NVIDIA Quantum-2 InfiniBand, and the full stack of NVIDIA AI software are coming to the Azure cloud. NVIDIA, Microsoft and global enterprises will then use this platform for rapid, cost-effective AI development and deployment.
NVIDIA announces a multi-year partnership with Microsoft to build one of the most powerful AI supercomputers in the world, powered by Microsoft Azure’s advanced supercomputing infrastructure combined with NVIDIA GPUs, networking and a full stack of AI software to help companies train, deploy and scale AI, including large, advanced models.
Azure’s cloud-based AI supercomputer includes powerful and scalable ND- and NC-series virtual machines optimized for distributed AI training and inference. It is the first public cloud to include NVIDIA’s advanced AI stack, adding tens of thousands of NVIDIA A100 and H100 GPUs, NVIDIA Quantum-2 400Gb/s InfiniBand networking and the NVIDIA AI Enterprise software suite to its platform.
As part of the collaboration, NVIDIA will use Azure’s scalable virtual machine instances to research and further accelerate advances in generative AI, a rapidly growing area of AI in which foundation models such as Megatron-Turing NLG 530B serve as the basis for unsupervised, self-learning algorithms that create new text, code, digital images, video or audio.
The companies will also collaborate on optimizing Microsoft’s DeepSpeed deep learning optimization software. NVIDIA’s full stack of AI workflows and software development kits, optimized for Azure, is being made available to Azure enterprise customers.
“Advances in AI technology and industry adoption are accelerating. The breakthrough of foundation models has sparked a flood of research, fueling new start-ups and enabling new business applications,” said Manuvir Das, vice president of enterprise computing at NVIDIA. “Our partnership with Microsoft will provide researchers and businesses with the latest AI infrastructure and software to take advantage of the transformative power of AI.”
“AI is fueling the next wave of automation in enterprise and industrial computing, enabling organizations to do more with less as they navigate economic uncertainties,” said Scott Guthrie, executive vice president of Cloud + AI Group at Microsoft. “Our partnership with NVIDIA unlocks the world’s most scalable supercomputing platform, bringing state-of-the-art AI capabilities to every enterprise on Microsoft Azure.”
Scalable peak performance with NVIDIA Compute and Quantum-2 InfiniBand on Azure
Microsoft Azure’s AI-optimized virtual machine instances are built with NVIDIA’s most advanced data center GPUs and are the first public cloud instances to include NVIDIA Quantum-2 400Gb/s InfiniBand networking. Customers can deploy thousands of GPUs in a single cluster to train even the most massive large language models, build the most complex recommender systems, and enable generative AI at scale.
Current Azure instances feature NVIDIA Quantum 200Gb/s InfiniBand networking with NVIDIA A100 GPUs. Future instances will integrate NVIDIA Quantum-2 400Gb/s InfiniBand networking and NVIDIA H100 GPUs. Combined with Azure’s advanced cloud infrastructure, networking and storage, these AI-optimized offerings deliver scalable peak performance for AI training and deep learning inference workloads of all sizes.
Accelerate AI development and deployment
In addition, the platform supports a wide range of AI applications and services, including Microsoft DeepSpeed and the NVIDIA AI Enterprise software suite.
Microsoft DeepSpeed will use the NVIDIA H100 Transformer Engine to accelerate transformer-based models used for large language models, generative AI and writing computer code, among other applications. This technology brings 8-bit floating-point precision to DeepSpeed to dramatically accelerate AI calculations for transformers, at twice the throughput of 16-bit operations.
NVIDIA AI Enterprise – the globally deployed software for the NVIDIA AI platform – is certified and supported on Microsoft Azure instances with NVIDIA A100 GPUs. Support for Azure instances with NVIDIA H100 GPUs will be added in a future software release.
NVIDIA AI Enterprise, which includes the NVIDIA Riva speech AI and NVIDIA Morpheus cybersecurity application frameworks, streamlines every step of the AI workflow, from data processing and AI model training to simulation and large-scale deployment.