
Xiaomi's first full-modal large model debuts: the final piece of its "Human, Vehicle, Home" multi-front strategy?
After releasing its self-developed AI large model Xiaomi MiMo-V2-Flash at the Human-Vehicle-Home Ecosystem Partner Conference in December 2025, Xiaomi has picked up the pace again.
On March 19, Xiaomi launched its first full-modal base model, Xiaomi MiMo-V2-Omni.
MiMo-V2-Omni is designed as an "executor" with cross-modal perception and GUI (Graphical User Interface) operation capabilities, allowing seamless integration with various Agent frameworks.
Previously, this model was blind-tested on the OpenRouter platform under the codename "Healer Alpha" and demonstrated performance that matches or even surpasses leading closed-source models in various benchmark tests.
Regarding the model's rapid launch cadence, Lei Jun stated, "We have been relatively low-key in the AI field, and our actual progress may be much faster than what everyone sees. This year, our R&D and capital investment in the AI field will exceed 16 billion yuan. I believe that as long as we continue to invest, Xiaomi will deliver an impressive answer sheet in the AI era."
As the model's technical lead, Luo Fuli also stated candidly on overseas social media, "Anyone in the MiMo team who has conducted fewer than 100 dialogue tests before tomorrow can leave directly. This tactic worked. Once the team's imagination is ignited by the capabilities of the agent system, that imagination directly translates into R&D speed."
Currently, Xiaomi has set API pricing at $0.4 per million input tokens and $2 per million output tokens, with support for a 256K context window.
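For scale, the published rates make per-request costs easy to estimate. A minimal sketch, using the stated $0.4/$2 per-million-token prices; the token counts in the example are hypothetical:

```python
# Back-of-the-envelope cost estimate at the published rates:
# $0.4 per million input tokens, $2 per million output tokens.
INPUT_RATE = 0.4 / 1_000_000   # USD per input token
OUTPUT_RATE = 2.0 / 1_000_000  # USD per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single API call."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Hypothetical example: an agent task sending a 50K-token context
# (well within the 256K window) and receiving a 4K-token response.
cost = request_cost(50_000, 4_000)
print(f"${cost:.4f}")  # 0.02 + 0.008 = $0.0280
```

At these rates, even context-heavy agent workloads cost fractions of a cent per call, which is consistent with Xiaomi positioning the model for high-volume ecosystem use rather than premium per-query revenue.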
Xiaomi's ambitions clearly extend beyond selling APIs to developers.
Xiaomi has currently partnered with Kingsoft Office (WPS) to explore text-generation and structured-data-processing scenarios for the model.
However, from a strategic depth perspective, the commercial endgame of MiMo-V2-Omni points to Xiaomi's "Human-Vehicle-Home Ecosystem."
In its future vision for MiMo-V2-Omni, Xiaomi also stated that it will "continue to promote long-term intelligent agent planning, real-time streaming perception, multi-agent collaboration, and deeper integration with the physical world."
If this model can be deeply integrated into Xiaomi's HyperOS as the underlying "brain," creating an AI base that can understand voice commands across devices, autonomously call mobile apps, and even control Xiaomi's in-vehicle interface, it will greatly enhance Xiaomi's hardware premium and user retention.
Despite the attractive technical demonstrations and ecological vision, Xiaomi is currently facing severe challenges in resource allocation and cost control.
Currently, Xiaomi is in a high-pressure "multi-front battle" state:
On one hand, the smartphone business, which is a cash cow, is facing headwinds from skyrocketing upstream storage chip prices, putting pressure on the overall hardware gross profit margin; on the other hand, the automotive business is in a critical period of capacity ramp-up and nationwide sales network expansion, requiring continuous investment.
Moreover, compared to pure internet giants with substantial profit margins and a large cloud computing base, Xiaomi does not hold an advantageous financial position in the AI arms race. From a strategic vision perspective, MiMo-V2-Omni is nonetheless the most critical piece of the puzzle for Xiaomi to complete the "full ecosystem" of smart integration between people, vehicles, and homes.
Amid rising memory prices, balancing multi-front investment across smartphones, automobiles, and large-model foundations will test the wisdom of Xiaomi's management.
