Nvidia has long been recognized as a leader in the AI accelerator space, but according to recent research by Databricks, Intel’s Gaudi 2 technology is giving Nvidia a run for its money. The study shows that Gaudi 2 competes robustly with Nvidia’s industry-leading AI accelerators, particularly in large language model (LLM) inference. The research is a significant validation of Intel’s Gaudi technology, which has been steadily improving since Intel’s acquisition of Habana Labs and its Gaudi technology in 2019. In this article, we will examine the key findings of the research and explore how Intel’s Gaudi 2 technology stacks up against Nvidia’s AI accelerators.
The Databricks research found that Intel Gaudi 2 delivers decoding latency comparable to Nvidia’s H100 systems and outperforms Nvidia’s A100 in LLM inference. Gaudi 2 also achieves higher memory-bandwidth utilization than both the H100 and the A100. Nvidia, however, still holds the edge in training performance on its top-end accelerators: using the Databricks MosaicML LLM Foundry for training, Gaudi 2 comes in second in single-node LLM training performance, surpassing 260 TFLOPS per chip. Even so, the research concludes that, based on public cloud pricing, Gaudi 2 offers the best performance per dollar for both training and inference when compared to the A100 and H100.
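The performance-per-dollar claim above combines two quantities: throughput per chip and the hourly cloud price of that chip. A minimal sketch of the metric is below; note that all throughput and price numbers are hypothetical placeholders for illustration, not figures from the Databricks study.

```python
# Illustrative only: every number below is a hypothetical placeholder,
# not a figure from the Databricks report or any vendor price list.

def perf_per_dollar(tflops_per_chip: float, hourly_price: float) -> float:
    """Training throughput delivered per dollar of cloud spend.

    Units: (TFLOPS sustained) / ($ per chip-hour).
    """
    return tflops_per_chip / hourly_price

# Hypothetical single-chip throughput (TFLOPS) and on-demand price ($/hr):
accelerators = {
    "accelerator_a": (260.0, 2.00),  # lower peak, cheaper per hour
    "accelerator_b": (400.0, 4.50),  # higher peak, pricier per hour
}

for name, (tflops, price) in accelerators.items():
    print(f"{name}: {perf_per_dollar(tflops, price):.1f} TFLOPS per $/hr")
```

The point of the sketch is that a chip with lower raw throughput can still win on this metric if its cloud price is low enough, which is the shape of the argument Databricks makes for Gaudi 2.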
The Databricks performance numbers are consistent with Intel’s own testing of Gaudi 2 via the MLCommons MLPerf benchmarks for training and inference. Eitan Medina, the COO at Habana Labs, an Intel company, affirms that the report provides validation for Intel and highlights the performance of Gaudi 2 to potential customers. Medina emphasizes that Gaudi is a viable alternative to Nvidia’s AI accelerators and hopes that more customers will become aware of its capabilities.
Looking ahead, Intel is preparing to launch its Gaudi 3 AI accelerator in 2024. Built on a 5-nanometer process, Gaudi 3 will offer four times the processing power and double the network bandwidth of Gaudi 2, translating into significant improvements in performance per dollar and performance per watt. Medina states that Gaudi 3 will launch and reach mass production in 2024, marking a major leap in performance for Intel.
While benchmarks like MLPerf and the Databricks report are valuable indicators of performance, Medina acknowledges that many customers rely on their own testing to ensure that the hardware and software stack suits their specific models and use cases. The maturity of the software stack is crucial in gaining customer trust, since vendors can tune their stacks to score well on specific benchmarks. MLPerf, however, provides an essential credibility filter for organizations before they invest time in their own testing. Although MLPerf results are not the sole basis for making business decisions, they serve as an initial assessment of a technology stack’s capabilities.
Intel remains committed to its CPU technologies for AI inference workloads and continues to see their value. The company recently announced its 5th Gen Xeon processors with AI acceleration, recognizing the significant role CPUs play in inference tasks. Medina notes that CPUs still handle a significant share of inference workloads, and that even fine-tuning can prove advantageous on CPUs.
Further out, Intel is working on future generations of accelerators that will merge its high-performance computing (HPC) and AI accelerator technology. This convergence opens up exciting possibilities for Intel as it continues to push the boundaries of AI acceleration and performance.
The research conducted by Databricks provides strong evidence that Intel’s Gaudi 2 technology can compete with Nvidia’s AI accelerators. Gaudi 2 delivers excellent performance in LLM inference, matching the decoding latency of Nvidia’s H100 systems and outperforming the A100. Although Nvidia still leads in training performance, Gaudi 2 offers the best performance per dollar for both training and inference. The validation from the Databricks research reinforces Intel’s position as a formidable competitor in the AI accelerator space. With the upcoming launch of Gaudi 3 in 2024, Intel is poised to further boost its performance and solidify its position as a leader in AI acceleration.