
Ratio Shifts from 1:8 to 1:1: Undervalued CPU Becomes New AI Bottleneck
The focus of the AI computing power race is quietly shifting: as agent-based AI explodes, the CPU has evolved from a supporting role in data centers into a critical bottleneck. The CPU-to-GPU ratio is expected to tighten sharply from 1:8 to 1:1, driving a fourfold surge in computing demand. Intel and AMD are raising prices amid shortages, while NVIDIA and Arm announced their entry into the market in the same month, accelerating a rewrite of the competitive landscape.
The focus of the AI computing power race is quietly shifting from GPUs to a long-overlooked player—the CPU.
With the explosive growth of AI agents and reinforcement learning (RL) workloads, the strategic position of the CPU in data centers is undergoing a structural revaluation. Dylan Patel, chief analyst at the leading semiconductor research firm SemiAnalysis, said in an in-depth interview on April 8 that the paradigm of AI workloads is evolving from simple text generation to complex agents and reinforcement learning, and that CPUs face "extremely severe capacity shortages."
A recent report from market research firm TrendForce supports this judgment: the CPU-to-GPU ratio in today's AI data centers is roughly 1:4 to 1:8, but in the era of agent-based AI it is expected to tighten to between 1:1 and 1:2.
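For a sense of scale, the short Python sketch below translates those ratios into CPU counts for a fixed GPU fleet; the 100,000-GPU fleet size is a hypothetical chosen for illustration, not a figure from the report.

```python
# Minimal sketch: server CPUs implied by a fixed GPU fleet as the
# CPU-to-GPU ratio tightens (ratios per TrendForce; fleet size hypothetical).

def cpus_needed(num_gpus: int, cpus_per_gpu: float) -> int:
    """CPU sockets required to pair with a given GPU fleet."""
    return round(num_gpus * cpus_per_gpu)

GPU_FLEET = 100_000  # hypothetical cluster size

for label, ratio in [("1:8", 1 / 8), ("1:4", 1 / 4), ("1:2", 1 / 2), ("1:1", 1.0)]:
    print(f"CPU:GPU = {label} -> {cpus_needed(GPU_FLEET, ratio):>7,} CPUs")
# 1:8 ->  12,500 CPUs ... 1:1 -> 100,000 CPUs, an eightfold swing in sockets.
```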

This structural shift has triggered chain reactions on both the supply and demand sides. By the end of the first quarter of 2026, Intel and AMD had already raised prices on certain CPU product lines. Meanwhile, NVIDIA and Arm both announced their entry into the server CPU market in March 2026. A GPU giant and an IP licensor making the same move in the same month is no coincidence; it is a concentrated market signal.
The Rise of Agents: The CPU Evolves from Supporting Role to Bottleneck
In the early stages of AI development, the CPU's role was marginal. Dylan Patel put it this way: "Light load. You send a string, it returns a string; simple inference requires little from the CPU." GPUs dominated AI computing demand with their massive parallel matrix operations, while CPUs handled auxiliary work such as compressing data and routing it through memory to the GPUs.
However, the new generation of reasoning models exemplified by OpenAI's o1, together with the emerging AI agent architecture, has fundamentally changed this picture. Unlike a static large language model, agent-based AI must interact dynamically with its environment: planning tasks, calling tools, passing data between sub-agents, and evaluating task completion. All of the coordination at this "orchestration layer" falls squarely on the CPU, making it a distinctly CPU-intensive workload.
An academic paper titled "A CPU-Centric Perspective on Agentic AI," released in November 2025, quantified this pressure: in agent-based AI scenarios, the latency generated by CPU-side tool processing (Python interpretation, web scraping, lexical summarization, database retrieval, and the like) can account for up to 90.6% of total latency, and in large-scale batch-processing scenarios the CPU can account for 44% of the system's total dynamic energy consumption.
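A toy decomposition makes the paper's point tangible; every timing below is hypothetical, chosen only to illustrate how CPU-side tool handling can come to dominate an agent step end to end.

```python
# Toy latency decomposition of one agent step (all timings hypothetical).
# The point: tool work on the CPU can dwarf the GPU's token generation.

cpu_tool_latency_s = {
    "python_interpretation": 2.0,   # hypothetical per-step cost
    "web_scraping": 4.5,
    "database_retrieval": 1.6,
    "summarization_prep": 0.9,
}
gpu_inference_latency_s = 0.95      # hypothetical LLM decode time per step

cpu_total = sum(cpu_tool_latency_s.values())
total = cpu_total + gpu_inference_latency_s
print(f"CPU share of total latency: {cpu_total / total:.1%}")  # ~90% here
```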
Arm's calculations reveal the magnitude of the demand gap from a capacity perspective: traditional AI data centers require approximately 30 million CPU cores per gigawatt (GW), but in the era of agent-based AI, this demand will surge to 120 million—a fourfold increase.
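Arm's arithmetic is easy to reproduce; the per-gigawatt core counts below are the figures quoted above, while the build-out sizes in the loop are hypothetical examples for scale.

```python
# Arm's capacity figures: CPU cores required per gigawatt of data-center
# power, before and after the shift to agent-based AI.

CORES_PER_GW_TRADITIONAL = 30_000_000
CORES_PER_GW_AGENTIC = 120_000_000

print(f"Demand multiplier: {CORES_PER_GW_AGENTIC // CORES_PER_GW_TRADITIONAL}x")

# Hypothetical build-out sizes, for scale:
for gw in (1, 5, 10):
    print(f"{gw:>2} GW -> {gw * CORES_PER_GW_AGENTIC:,} cores in the agentic era")
```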

Intel Under Pressure, AMD Gaining Ground
The structural rise in CPU demand has first reshaped the landscape in the traditional x86 market.
Intel's Xeon processors long held more than 95% of the data center CPU market. That dominance began to erode in 2021, when yield problems with Intel's 7nm process delayed the launch of the Sapphire Rapids Xeons by nearly two years, opening a market gap for AMD's EPYC Milan.
In 2026, Intel plans to launch two flagship products: the Xeon 6+ (Clearwater Forest), featuring the Darkmont architecture with 288 cores/288 threads and a TDP of approximately 450W; and the Xeon 7 (Diamond Rapids), using the Panther Cove-X architecture with up to 256 cores/256 threads and a TDP reaching 650W. Both are built on Intel's most advanced 18A process and introduce Foveros Direct hybrid bonding for the first time. However, TrendForce notes that persistent 18A yield issues may push mass production of both products to 2027.
AMD's cadence, by contrast, looks steadier. Its 2026 flagship, EPYC Venice, will use TSMC's N2 process, the Zen 6 architecture, and CoWoS-L and SoIC advanced packaging, reaching 256 cores/512 threads via simultaneous multithreading (SMT), the highest thread count on the market. TrendForce expects AMD to keep eroding Intel's share throughout 2026.
NVIDIA and Arm Make Forceful Entries, Rewriting the Competitive Landscape
Beyond the traditional x86 duopoly, non-traditional players are entering the server CPU market at unprecedented speed, fundamentally rewriting the competitive landscape.
In March 2026, NVIDIA announced it would sell its Vera CPU as a standalone product to meet customer demand for more flexible CPU:GPU configurations. Vera uses NVIDIA's self-developed Olympus architecture, built on TSMC's N3 process with CoWoS-R packaging, offering 88 cores/176 threads and a 1.8 TB/s NVLink-C2C interconnect that enables memory sharing with NVIDIA GPUs. Initial partners include Alibaba, ByteDance, Cloudflare, CoreWeave, and Oracle, among others. NVIDIA also launched a Vera CPU rack that integrates 256 CPUs per rack, for a total of 22,528 cores/45,056 threads and 400 TB of memory.

Later that same month, Arm announced its first self-developed CPU product, the Arm AGI CPU, ending its 35-year history as a pure IP licensor. Built on TSMC's N3 process and the Neoverse V3 architecture, the chip offers 136 cores/136 threads, a 300W TDP, and support for DDR5-8800 memory and PCIe Gen6. Initial partners include Meta, OpenAI, Cerebras, Cloudflare, and SK Telecom, among others. Arm simultaneously launched two rack configurations: an air-cooled version integrating 60 AGI CPUs (8,160 cores, approximately 180 TB of memory) and a liquid-cooled version supporting 336 CPUs (45,696 cores, 1 PB of memory).
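The rack-level totals from both announcements can be cross-checked directly from the per-CPU specifications quoted above; the grouping below is only an illustration, not either vendor's configuration format.

```python
# Cross-check of the rack-level totals: per-rack cores and threads follow
# directly from per-CPU specs times CPUs per rack.

racks = {
    # name: (cpus_per_rack, cores_per_cpu, threads_per_cpu)
    "NVIDIA Vera rack":      (256, 88, 176),
    "Arm AGI rack (air)":    (60, 136, 136),
    "Arm AGI rack (liquid)": (336, 136, 136),
}

for name, (cpus, cores, threads) in racks.items():
    print(f"{name}: {cpus * cores:,} cores / {cpus * threads:,} threads")

# NVIDIA Vera rack: 22,528 cores / 45,056 threads
# Arm AGI rack (air): 8,160 cores / 8,160 threads
# Arm AGI rack (liquid): 45,696 cores / 45,696 threads
```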
Major cloud service providers (CSPs) are also accelerating their push into self-developed CPUs. AWS released Graviton5 (192 cores/192 threads) on TSMC's N3 process in December 2025, deploying it alongside its self-developed Trainium 3 AI ASIC to lower AI computing costs; Microsoft launched Cobalt 200 (N3 process, 132 cores/132 threads) in November 2025; and Google plans to launch a bare-metal Axion C4A.metal variant and the next-generation Axion N4A in 2026, with a focus on cost-effectiveness.
IC Backend Design Service Providers Seize Incremental Opportunities
The large-scale entry of non-traditional players is creating significant incremental business opportunities for IC backend design service providers.
TrendForce points out that AWS still completes CPU backend design in-house, while Google and Microsoft have both outsourced theirs to Global Unichip Corp. (GUC). As more CSPs and emerging CPU makers enter the market, this outsourcing demand is expected to keep expanding.
TrendForce forecasts that between 2026 and 2028, ASIC design service providers such as Broadcom, Marvell, GUC, Alchip, and MediaTek will successively take on new projects from these clients. For investors seeking fresh entry points into AI infrastructure, this segment may represent a structural opportunity beyond the GPU boom that has yet to be fully priced in.
