Facing TPU and Trainium, NVIDIA again sets out to "prove" itself: GB200 NVL72 can boost open-source AI model performance by up to 10 times

After publicly claiming to be "a generation ahead of the industry" and privately rebutting short sellers' views, NVIDIA recently released a technical blog post stating that its GB200 NVL72 system can improve the performance of open-source AI models by up to 10 times. The system addresses the scalability challenges of MoE models in production environments through hardware-software co-design, effectively eliminating performance bottlenecks in traditional deployments.

NVIDIA is facing challenges from competitors such as Google's TPU and Amazon's Trainium. To consolidate its dominant position in the AI chip market, the company has recently launched a series of intensive technical "self-certifications" and public responses. Following its previous rebuttal of bearish views through private letters and its public claim that its GPU technology is "a generation ahead of the industry," NVIDIA has once again released a technical blog post emphasizing that its GB200 NVL72 system can enhance the performance of top open-source AI models by up to 10 times.

According to media reports on December 4, NVIDIA stated that the GB200 NVL72 system can improve the performance of top open-source AI models by as much as 10 times. The company highlighted how its server system optimizes mixture-of-experts (MoE) models, including Kimi K2 Thinking, developed by the Chinese startup Moonshot AI, and DeepSeek's R1 model.

NVIDIA's series of technical "self-certifications" is seen as a direct response to market concerns. Earlier reports indicated that Meta, a key NVIDIA customer, is considering large-scale adoption of Google's in-house AI chip, the Tensor Processing Unit (TPU), in its data centers. According to Wallstreetcn, Google's TPU directly challenges NVIDIA's share of more than 90% of the AI chip market. The market worries that if major customers like Meta begin to shift toward Google, it would signal a crack in NVIDIA's seemingly impregnable moat.

NVIDIA's flurry of communications has not eased market concerns, and the company's stock price has fallen nearly 10% over the past month.

GB200 NVL72 Technical Advantages Highlighted

NVIDIA stated in its official blog that the GB200 NVL72 system can significantly improve the performance of leading open-source AI models, and the post set out the system's technical advantages. The system integrates 72 NVIDIA Blackwell GPUs into a single unit, providing 1.4 exaflops of AI performance and 30TB of fast shared memory. Through NVLink Switch connections, communication bandwidth among the GPUs within the system reaches 130TB/s.
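To put the rack-level figures above in perspective, the following is a rough back-of-the-envelope sketch that divides them evenly across the 72 GPUs. The even split and the decimal unit conversions (1 exaflop = 1,000 petaflops, 1TB = 1,000GB) are assumptions made here for illustration; the article does not give per-GPU numbers, and the shared memory pool is not necessarily partitioned this way.

```python
# Rack-level figures as quoted in the article.
NUM_GPUS = 72
AI_PERF_EXAFLOPS = 1.4
SHARED_MEMORY_TB = 30
NVLINK_BANDWIDTH_TB_PER_S = 130

# Assumption: an even split across GPUs, using decimal unit conversions.
perf_per_gpu_petaflops = AI_PERF_EXAFLOPS * 1000 / NUM_GPUS
memory_per_gpu_gb = SHARED_MEMORY_TB * 1000 / NUM_GPUS
bandwidth_per_gpu_tb_per_s = NVLINK_BANDWIDTH_TB_PER_S / NUM_GPUS

print(f"AI performance per GPU: {perf_per_gpu_petaflops:.1f} petaflops")
print(f"Shared memory per GPU:  {memory_per_gpu_gb:.0f} GB")
print(f"NVLink bandwidth per GPU: {bandwidth_per_gpu_tb_per_s:.2f} TB/s")
```

Under these assumptions, each GPU's share works out to roughly 19 petaflops, about 417GB of the shared memory pool, and about 1.8TB/s of NVLink bandwidth.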

In performance tests, Kimi K2 Thinking, rated the most intelligent open-source model by Artificial Analysis, achieved a 10-fold performance improvement on the GB200 NVL72 system. Other top MoE models, such as DeepSeek-R1 and Mistral Large 3, also saw significant performance improvements.

The mixture-of-experts (MoE) architecture has become a mainstream choice for cutting-edge AI models. NVIDIA pointed out that all of the top 10 open-source models on the Artificial Analysis leaderboard use the MoE architecture, including DeepSeek-R1, Kimi K2 Thinking, and Mistral Large 3. The architecture mimics the way the human brain works, activating only the specialized "expert" modules needed for a specific task rather than calling on all of the model's parameters. This allows MoE models to generate tokens faster and more efficiently without a disproportionate increase in computational cost.
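The idea of activating only a few experts per token can be illustrated with a minimal top-k routing sketch. This is a toy example and not the routing scheme of any model named in the article: the expert count, the top-k value, the scalar "experts", and the function names route_token and moe_layer are all hypothetical.

```python
import math
import random

def softmax(scores):
    """Convert raw router scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_token(router_scores, top_k=2):
    """Pick the top_k experts for one token and renormalize their weights."""
    probs = softmax(router_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]

def moe_layer(x, experts, router_scores, top_k=2):
    """Weighted sum of the outputs of only the selected experts."""
    out = 0.0
    for idx, weight in route_token(router_scores, top_k):
        out += weight * experts[idx](x)  # unselected experts are never run
    return out

# Hypothetical toy experts: each one simply scales its input.
NUM_EXPERTS = 8
experts = [lambda x, s=i + 1: s * x for i in range(NUM_EXPERTS)]

random.seed(0)
scores = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]  # stand-in router output
selected = route_token(scores, top_k=2)
y = moe_layer(3.0, experts, scores, top_k=2)
active_fraction = len(selected) / NUM_EXPERTS

print("selected experts:", [i for i, _ in selected])
print("fraction of experts active for this token:", active_fraction)
```

With 8 experts and top_k=2, only a quarter of the experts run for a given token, which is why per-token compute can stay modest even as total parameter count grows. In a real deployment the experts are spread across many GPUs, so tokens must be sent to whichever devices hold their selected experts.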

NVIDIA emphasizes that its system addresses the scalability challenges of MoE models in production environments through the collaborative design of hardware and software, effectively eliminating performance bottlenecks in traditional deployments.

Cloud Service Providers Accelerate Deployment

NVIDIA revealed that the GB200 NVL72 system is being deployed by major cloud service providers and NVIDIA cloud partners, including Amazon Web Services, Core42, CoreWeave, Crusoe, Google Cloud, Lambda, Microsoft Azure, Oracle Cloud Infrastructure, and Together AI.

Peter Salanki, co-founder and CTO of CoreWeave, stated, "At CoreWeave, our customers are leveraging our platform to put mixture-of-experts models into production. Through close collaboration with NVIDIA, we are able to provide a tightly integrated platform."

Lin Qiao, co-founder and CEO of Fireworks AI, pointed out, "The rack-scale design of the NVIDIA GB200 NVL72 significantly enhances the serving efficiency of MoE models, setting new benchmarks for performance and efficiency in large-scale MoE model serving." The company has reportedly deployed the Kimi K2 model on the NVIDIA B200 platform, achieving the highest performance on the Artificial Analysis leaderboard.