
Meta aims to turn AI capital expenditures into a "computing power balance sheet": from the AI arms race to cloud computing, Meta wants to capture the dividends of the AI inference era
Meta is formulating an unprecedented cloud computing infrastructure business plan that would turn its internal AI computing power assets into a platform leased to external customers. By selling surplus AI computing power and access to its proprietary large-model resources, Meta seeks to compete with giants like AWS and Azure and to turn its substantial capital expenditures into an "AI computing power balance sheet" with resilient revenue. The move aims to capture the dividends of the AI inference era and to meet growing demand for cloud computing power.
According to Zhitong Finance APP, Meta Platforms Inc. (META.US), the parent company of Facebook and Instagram, is formulating an unprecedented cloud computing infrastructure business plan. The business would sell access to Meta's large-scale AI computing infrastructure and its proprietary large AI model resources, opening a new front of competition against cloud computing industry leaders such as Amazon Web Services (AWS), Microsoft Azure, and Google Cloud.
According to recent media reports citing people familiar with the matter, Meta has been racing to lock in costly AI data center construction and massive amounts of underlying AI computing infrastructure, such as AI GPUs and data center CPUs, to support its AI ambitions. At the same time, the company is setting up a significant new business to generate AI-related revenue by selling surplus AI computing resources to external customers. The people asked not to be identified because the details are confidential.
The core logic behind Meta's entry into cloud computing is to turn its massive, much-questioned AI capital expenditures into an "AI computing power balance sheet" with external revenue elasticity, at a time when global demand for AI computing resources is surging. Meta is building a cloud business to sell surplus AI computing power, while recent reports that Google restricted Meta's use of the Gemini model further suggest that demand for AI computing power is so strong that even large tech companies face shortages of cloud AI computing resources. Although Google Cloud's revenue reached $20 billion in the first quarter, computing constraints are still capping its growth and adding to its order backlog.
This also means that Meta's strategy is not simply "crossing into the cloud." Meta first stockpiles GPU/ASIC/TPU resources, network infrastructure, HBM/DRAM/NAND memory and storage components, power, and data center capacity to serve its own priorities: the race to be first to AI superintelligence, advertising recommendations, video generation, AI assistants, model training, and large-scale cloud AI inference. Once there is a temporary surplus, that capacity can be packaged as model APIs or raw computing power leasing, benchmarked against AWS Bedrock or CoreWeave-style new cloud providers. This provides downside protection against "overbuilt AI computing infrastructure" and offers investors a path to recovering AI capital expenditures.
On a deeper level, Meta is betting on the long-term scarcity of computing resources as AI workloads shift from being training-centric to inference-centric. McKinsey estimates that by 2030, global data centers will require approximately $6.7 trillion in investment to meet computing demand, of which about $5.2 trillion is capital expenditure for AI-related data centers. The International Energy Agency predicts that global data center electricity consumption will double to approximately 945 terawatt-hours by 2030, with consumption by AI-driven accelerated servers growing about 30% a year. Wall Street financial giant Goldman Sachs also predicts that electricity demand from U.S. data centers will rise from 31 gigawatts in 2025 to 66 gigawatts in 2027. Meta's move therefore ultimately targets the scarcest resource of the AI era, "usable inference capacity": GPU/ASIC/TPU computing clusters, low-latency networks, model hosting, token billing, and the high-efficiency scheduling capabilities of hyperscale data centers.
Wall Street's valuation anchor for Meta may shift from "advertising platform + metaverse spending discount" to "advertising cash cow + AI model entry point + computing infrastructure option." The key to success, however, lies not in whether Meta can buy chips, but in whether it can build out a sales organization, an AI developer ecosystem, a reliable large-scale cloud AI computing leasing service, a cloud platform software stack, and sound per-token unit economics.
From the AI superintelligence arms race to challenging AWS, Azure, and Google Cloud, Meta strives to turn "excess computing power" into a cloud business flywheel.
It is understood that one important plan under consideration at Meta involves selling access to various AI models hosted on Meta's existing AI computing infrastructure; insiders say this approach is similar to Amazon Web Services' Bedrock product. Meta would operate the data centers, AI chips, and other computing resources that run these models, which include its own newly developed Muse Spark model, and charge developers access fees.
Insiders say the company is also considering selling access to "raw" cloud computing capacity, similar to so-called new cloud leaders such as CoreWeave Inc. These new business lines are being developed as part of Meta Compute, an internal initiative to build and manage the company's AI computing infrastructure deployments. Meta Compute is led by Santosh Janardhan, Meta's head of infrastructure; Daniel Gross, a leader within the Meta Superintelligence Labs AI unit; and Meta President Dina Powell McCormick.
A Meta spokesperson declined to comment. The company's plans are still being formulated, and the related strategy may change. Meta's stock price rose nearly 10% in early trading on Wednesday before giving back some of the gains. CoreWeave's stock price, by contrast, plummeted more than 10% in early trading.
Meta has listed the development of AI "superintelligence" as one of its top priorities and has committed to investing hundreds of billions of dollars in data centers and other critical AI computing infrastructure, including the expensive AI chips and related hardware systems it deems necessary to achieve this goal. That spending, which includes large computing deals with cloud providers such as CoreWeave, Alphabet Inc.'s Google, and Oracle Corp., has made investors anxious about how Meta will generate returns on it. A cloud computing business would provide an important way to recoup part of the investment in AI computing infrastructure. Amazon Web Services, Azure, and Google Cloud have spent many years building hyperscale cloud platforms that rent out computing power, large-scale enterprise storage capacity, and access to certain major software over the internet. These businesses now generate tens of billions of dollars in revenue each quarter.
With the surge in AI demand, these service providers have also expanded to rent out dedicated AI chips and extensive AI computing infrastructure resources and capacities needed for training and running AI models. This is a complex business that requires not only large clusters of data centers but also a vast cloud computing ecosystem formed by software platforms, enterprise sales teams, and customer support operations.
Elon Musk's SpaceX recently became a key player in cloud computing resource leasing after acquiring Musk's AI startup xAI in February. Earlier this year, the company leased cloud access to the underlying computing resources of its large AI data center in Memphis to AI application leader Anthropic PBC, and it also struck a cloud AI computing leasing deal with Google. According to a Bloomberg Intelligence forecast, this strategy could help xAI generate more than $50 billion in revenue by 2028 and at least $100 billion by 2030.
Despite the complexities, Meta CEO Mark Zuckerberg has indicated to investors that he is willing to sell excess computing infrastructure and is even open to launching a so-called large model API service that lets customers pay for AI usage. This type of business is typically metered in tokens, the units of data consumed and generated by customer queries.
During a call with shareholders in May, Zuckerberg stated, “This is definitely on the table. Almost every week, different external companies come to us, asking us to build specialized API service systems and inquiring if we have computing power they can purchase from us, and they are willing to pay a premium above our acquisition cost.”
Zuckerberg noted at the time, “We haven't done this yet because we believe we have uses for this computing power. But clearly, if we reach a point where we think we have overbuilt, then this is an option we have, which also gives us more confidence in investing in building this infrastructure.”
In the rapidly evolving arms race for AI computing infrastructure, Zuckerberg has repeatedly hinted that he believes the industry is constrained by AI computing capacity, and Meta should accumulate as much computing power as possible before deciding its uses.
Behind Meta's Transformation of AI Cloud into a "Computing Power Recycling Flywheel": The Explosion of Agent Runtime, Semiconductor Gaps, and Resonating Demand for Cloud AI Inference Computing Power
The core logic behind Meta's AI cloud business is to turn the enormous stock of ASICs, TPUs, GPUs, data center CPUs, data center power chains, optical interconnect networks, and other expensive computing assets created by the "superintelligence" arms race from a purely internal cost center into an externally rentable AI computing infrastructure platform. For the capital market, this is equivalent to equipping Meta's hundreds of billions of dollars in AI capital expenditures with a "residual value recovery mechanism": when in use, the capacity serves advertising recommendations, video generation, AI assistants, and superintelligence training; when idle, it becomes a broad revenue category spanning APIs, model hosting, inference clusters, and GPU leasing, thereby easing market anxiety over the return on AI capital expenditures.
Leverage and crowded positioning in the AI semiconductor trade have been building, along with rising price pressure on consumer electronics leaders like Apple. Against that backdrop, the Philadelphia Semiconductor Index suffered a single-day drop of 7.9% and several swings of more than 5% within a month, showing that the semiconductor-linked AI computing power supply chain has entered a phase of high volatility, extreme leverage, and crowded bullish positions, with significant pressure to deliver on high expectations. This is why many institutional investors have recently begun to emphasize overly pessimistic bearish narratives such as "the AI semiconductor trading frenzy has peaked" and "the AI bubble is gradually bursting."
However, the well-known investment firm Nomura released a research report on Wednesday refuting the "semiconductor peak theory." The key to Nomura's rebuttal is not simply a claim that AI chips will keep rising, but the observation that demand for AI cloud infrastructure is spreading from a single-point shortage of GPUs to a systemic mismatch across components. According to Nomura's research framework, AI server revenue is expected to grow by 78% in 2026 and 76% in 2027, with the number of global data center projects increasing from 240 to 280, including about 50 gigawatt-level projects. Newly deployed computing power is expected to reach 32GW by 2027, with 23GW already visible for 2028. The real bottleneck, however, is shifting from GPU capacity and TSMC's CoWoS advanced packaging to wafer-level substrates, AI PCBs, copper-clad laminates (CCL), electronic fabrics, MLCCs, glass and ABF substrates, IC substrates, high-end capacitors, power management chips, and high-speed optical interconnect components for data centers.
McKinsey's medium- to long-term estimates reportedly support the direction Nomura emphasizes: by 2030, meeting AI-related demand will require the global computing power value chain to invest approximately $5.2 trillion in data centers, corresponding to about 156GW of AI-related data center capacity demand. This means that the main theme of the semiconductor trade is not "peaking" but a "rotation of shortages": from GPUs to HBM, from advanced packaging to substrate materials, and then to power, liquid cooling, networking, and cloud scheduling software. Upward earnings revisions and expectations of price increases are still likely to be the strongest catalysts for the core hardware chain tied to AI computing power.
The widely discussed "Agent Runtime" thesis further explains why demand for cloud AI inference may be close to endless. Traditional model hosting is a stateless "input - inference - output" service, while agentic AI is a looping workflow: planning, calling models, calling tools, observing, retrying, and correcting errors until the task is completed. One frequently cited example makes the point: even with a 95% success rate on each individual call, the success rate of a task that requires 15 consecutive calls is only about 46.3%. What enterprises are truly purchasing, therefore, is not cheap tokens but a runtime system that is orchestratable, observable, traceable, controllable in cost, and auditable in its results. Nebius highlights its plan to go from more than 200MW of operational computing power to between 800MW and 1GW by the end of the year, while securing more than 3GW of reserved capacity. This reflects an upgrade in cloud infrastructure from "selling GPU hours" to "selling result certainty." The International Energy Agency's forecast that global data center electricity consumption will double to approximately 945TWh by 2030 likewise indicates that Agent Runtime is not merely a change in software architecture but will drive full-stack investment in GPU clusters, power capacity, network switching, storage retrieval, model routing, and observability platforms.
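The 46.3% figure follows from simple compounding. The sketch below assumes that every call in a task succeeds independently with the same probability and that the task fails if any single call fails; the article does not state these assumptions, and the function name is purely illustrative.

```python
def task_success_rate(step_success: float, steps: int) -> float:
    """Probability that `steps` consecutive calls all succeed.

    Assumes (hypothetically) that calls are independent and share the
    same per-call success probability, with no retries.
    """
    if not 0.0 <= step_success <= 1.0:
        raise ValueError("step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return step_success ** steps


# The article's example: 95% per-call success over 15 consecutive calls.
print(f"{task_success_rate(0.95, 15):.1%}")  # prints 46.3%

# How quickly reliability decays as agent tasks chain more calls.
for steps in (1, 5, 15, 30):
    print(f"{steps:>2} calls: {task_success_rate(0.95, steps):.1%}")
```

Under these assumptions, end-to-end reliability falls geometrically with the number of chained calls, which is why retries, observability, and error correction at the runtime layer matter more than the price of any single call.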
Meta's entry into cloud computing can be seen as the Agent Runtime flywheel taking off in resonance with the semiconductor supply gap and rising demand for cloud AI inference computing power. On the demand side, the agent task loop amplifies token and inference consumption; on the supply side, advanced packaging and small-component bottlenecks limit AI server deliveries; on the commercial side, Meta, CoreWeave, Nebius, AWS, Azure, and Google Cloud are competing for the monetization entry point of "available AI computing power." AI cloud computing is no longer just a story of traditional cloud vendors expanding IaaS CPU capacity and scheduling software; it has entered a composite cycle of "computing power asset securitization + inference economics + semiconductor price increase chain." The assets that benefit most are often not a pure GPU story but segments that simultaneously have supply bottlenecks, pricing power, and visible cash flow.
