
AI Training Computer Speed Test Results Released by MLPerf: NVIDIA Showcases GPT-3 Training in Under 4 Minutes, Google Unveils TPU v5e

MLCommons has released the results of MLPerf Training 3.1, and in this round the spotlight was on training large language models (LLMs). NVIDIA once again showcased the fastest training system, NVIDIA Eos, powered by 10,752 NVIDIA H100 chips. Eos completed the GPT-3 benchmark in just 3.9 minutes, a 2.8x improvement over NVIDIA's previous submission, with an impressive scaling efficiency of 93%.
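The 93% figure can be checked with a back-of-envelope calculation, assuming (the article does not state this) that the baseline is NVIDIA's earlier 3,584-chip submission, making ideal linear scaling exactly 3x:

```python
# Sketch: scaling efficiency = measured speedup / ideal linear speedup.
# The 3,584-chip baseline is an assumption, not stated in the article.
prev_chips = 3_584        # assumed size of the earlier submission
curr_chips = 10_752       # Eos size in this round
measured_speedup = 2.8    # reported speedup over the earlier run

ideal_speedup = curr_chips / prev_chips              # 3.0
scaling_efficiency = measured_speedup / ideal_speedup
print(f"scaling efficiency: {scaling_efficiency:.0%}")  # → 93%
```

Under that assumption, 2.8 / 3.0 comes out to roughly 93%, matching the reported figure.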

What makes this round special is that Microsoft Azure also submitted a system with the same specifications. The Azure ND system finished only 2% slower than NVIDIA's, demonstrating that a machine of this scale is genuinely viable in the cloud.

On the Google side, the company submitted the TPU v5e, demonstrating high-accuracy INT8 quantization techniques. Although training the GPT-3 benchmark took 44.68 minutes on a 4,096-chip TPU v5e cluster, Google showed that the chip's performance-to-cost ratio is significantly better: the rental price is just $1.20 per chip per hour. Alongside these results, Google also announced general availability (GA) of TPU v5e rental services.
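Using only the figures quoted above, a rough rental-cost estimate for that run can be sketched (real billing would also involve minimums, networking, and storage, which are not covered here):

```python
# Back-of-envelope rental cost for the TPU v5e GPT-3 benchmark run,
# based solely on the chip count, training time, and hourly price
# quoted in the article.
chips = 4_096
minutes = 44.68
price_per_chip_hour = 1.20  # USD

chip_hours = chips * minutes / 60        # ≈ 3,050 chip-hours
cost = chip_hours * price_per_chip_hour  # ≈ $3,660
print(f"{chip_hours:,.0f} chip-hours ≈ ${cost:,.0f}")
```

At the listed rate, the whole benchmark run works out to only a few thousand dollars of rental time.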

Intel, on the other hand, again submitted results for its Gaudi2 chip, this time speeding up training by switching to FP8 precision. With a total training time of 153.58 minutes on a 384-chip Gaudi2 system, Gaudi2's cost-effective performance makes it practical for organizations to purchase and operate the chips themselves.
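The three GPT-3 submissions can be put on a common footing by comparing raw accelerator-hours (chip count times time-to-train). This is only a sketch from the article's own numbers; it says nothing about per-chip price, which is listed only for TPU v5e:

```python
# Accelerator-hours for each GPT-3 submission mentioned in the article
# (chips x minutes / 60). Figures are taken directly from the text.
runs = {
    "NVIDIA Eos (H100)": (10_752, 3.9),
    "Google TPU v5e":    (4_096, 44.68),
    "Intel Gaudi2":      (384, 153.58),
}
for name, (chips, minutes) in runs.items():
    chip_hours = chips * minutes / 60
    print(f"{name}: {chip_hours:,.0f} accelerator-hours")
```

By this crude measure the small Gaudi2 system consumes fewer accelerator-hours than the TPU v5e cluster, though the chips are not directly comparable in capability or cost.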

TL;DR: MLCommons released the MLPerf Training 3.1 benchmark results for training large language models. NVIDIA's Eos was the fastest system; Azure finished only slightly slower, showing this scale is viable in the cloud; Google showcased the cost-efficient TPU v5e with high-accuracy INT8 training; and Intel's Gaudi2 gained speed from FP8 while remaining cost-effective.

More Reading


Revealed: Anthropic Disseminates Report on AI Vulnerabilities Succumbing to Answering Unsafe Questions Inappropriately If Repeatedly Posed in Relevant Contexts

Unveiling NVIDIA’s Latest Innovation: The Nemotron-4 340B Model for Synthetic Data Generation in LLM Training

Google Cloud Unveils New TPU v5e, Emphasizing Better Performance at a More Affordable Price Than TPU v4