AI: Nvidia unveils its new Hopper architecture, the successor to Ampere


Nvidia unveiled Hopper, its new architecture for AI workloads in data centers. It succeeds Ampere and takes its name from computing pioneer Grace Hopper. The first Hopper-based product will be the H100, which contains 80 billion transistors, is built on TSMC’s 4N process, and offers three to six times the performance of the Ampere-based A100, depending on the precision used. The GPU will support PCIe Gen5, fourth-generation NVLink, and HBM3 memory, with 3TB/s of memory bandwidth.

“Twenty H100 GPUs can sustain the equivalent of global internet traffic, enabling customers to deliver advanced recommender systems and large language models that perform real-time data inference,” Nvidia management argued during Hopper’s presentation. The GPU will also feature the second generation of Multi-Instance GPU (MIG) technology, allowing a single GPU to be partitioned between up to seven tenants.
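For a sense of how such partitioning is typically configured, here is a minimal Python sketch driving the standard nvidia-smi MIG commands from the host; the device index (0) and the 1g.10gb instance profile are illustrative assumptions, not details from the announcement.

    # Minimal sketch: partitioning a single H100 into seven MIG tenants
    # by driving nvidia-smi from Python. The device index (0) and the
    # 1g.10gb instance profile are illustrative assumptions.
    import subprocess

    def run(cmd: str) -> str:
        # Execute a host command and return its stdout as text.
        return subprocess.run(cmd.split(), check=True,
                              capture_output=True, text=True).stdout

    # Enable MIG mode on GPU 0 (requires administrator privileges).
    print(run("nvidia-smi -i 0 -mig 1"))

    # Create seven 1g.10gb GPU instances, each with a compute
    # instance (-C), giving one isolated slice per tenant.
    profiles = ",".join(["1g.10gb"] * 7)
    print(run(f"nvidia-smi mig -i 0 -cgi {profiles} -C"))

    # List the resulting GPU instances.
    print(run("nvidia-smi mig -lgi"))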

The company also claims that the GPU can serve those tenants securely, thanks to the use of confidential computing. “Hopper brings Confidential Computing to Accelerated Computing using a combination of hardware and software. When Confidential Computing is enabled and the Trusted Execution Environment is created through a Confidential Virtual Machine encompassing both CPU and GPU, data transfers between CPU and GPU, and between GPUs within a node, are encrypted and decrypted at the maximum PCIe line speed,” says Paresh Kharya, Senior Director of Data Center Computing at Nvidia.

A GPU launching in the third quarter

“The H100 also has a hardware firewall that secures the entire workload on the H100 GPU and isolates it across its memory and compute engines, so no one but the owner of the trusted execution environment holding the key can touch the data encrypted inside. This design ensures complete isolation of the VM and prevents access or modification by any unauthorized entity, including the hypervisor, the host operating system, or even anyone with physical access.”

Nvidia claims that the H100 can serve the 105-layer, 530-billion-parameter Megatron-Turing NLG 530B model with up to 30 times the throughput of the A100. For training a 395-billion-parameter Mixture of Experts Transformer model on 8,000 GPUs, Nvidia management indicated the job would take only 20 hours, compared to the seven days required with A100s. To get there, the company has bundled eight H100 GPUs into its DGX H100 system, which will deliver 32 petaflops on FP8 workloads, while its new DGX SuperPod will link up to 32 DGX H100 nodes through a switch built on fourth-generation NVLink, capable of 900GB/s.
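On the software side, Hopper’s FP8 math is exposed to developers through Nvidia’s Transformer Engine library for PyTorch. The following is a minimal sketch of running a layer under FP8 autocasting; the layer sizes, batch size, and recipe settings are illustrative assumptions rather than figures from the announcement.

    # Minimal sketch of FP8 execution on Hopper via NVIDIA's
    # Transformer Engine for PyTorch. Layer sizes, batch size, and
    # recipe settings are illustrative assumptions.
    import torch
    import transformer_engine.pytorch as te
    from transformer_engine.common import recipe

    # Delayed-scaling recipe selecting the E4M3 FP8 format.
    fp8_recipe = recipe.DelayedScaling(margin=0, fp8_format=recipe.Format.E4M3)

    # A Transformer Engine linear layer whose matmuls run in FP8
    # inside the autocast region below.
    layer = te.Linear(4096, 4096, bias=True).cuda()
    x = torch.randn(16, 4096, device="cuda")

    with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
        y = layer(x)

    y.float().sum().backward()  # backward pass also uses FP8 kernels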

For those on a more modest budget, the GPU will also be available from the usual cloud service providers. The H100 will launch in the third quarter of 2022. Alongside the H100, Nvidia also unveiled the Grace CPU Superchip, built from a pair of Grace chips connected by the NVLink-C2C chip-to-chip interconnect. The superchip packs 144 Arm cores into a single socket, uses LPDDR5x memory with ECC, and draws 500 watts. This Grace superchip, like the CPU-GPU Grace Hopper Superchip announced last year, will be available in the first half of 2023.

Source: ZDNet.com




