NVIDIA's new growth driver arrives: major launch of a Hindi AI model

NVIDIA has launched a lightweight artificial intelligence model for Hindi, aiming to enter India's rapidly growing AI software market. The move shows NVIDIA actively expanding into local-language AI applications, beyond its monopoly position in AI infrastructure. CEO Jensen Huang revealed that the design flaw in the latest Blackwell architecture AI GPU has been fixed; Morgan Stanley expects mass shipments in the fourth quarter of 2024, contributing $10 billion in revenue. Huang will also meet Mukesh Ambani, Chairman of Reliance Industries, in Mumbai.

According to the Zhītōng Finance APP, global AI chip leader NVIDIA (NVDA.US) has launched a lightweight artificial intelligence model built specifically for Hindi, one of India's most widely used languages. The move highlights NVIDIA's AI ambitions across the Indian market: it wants not only to supply the AI GPUs behind India's AI development, but also to enter the country's growing AI software market through local-language AI models.

The day before the announcement, NVIDIA CEO Jensen Huang revealed during a visit to Denmark that, with significant assistance from TSMC (TSM.US), the critical design flaw that had hampered manufacturing of its latest AI GPU, built on the Blackwell architecture, has been fully fixed. Huang said on Wednesday that the Blackwell GPU had a design flaw: "it was very powerful, but the design flaw initially led to low yields." He said the flaw was "100% NVIDIA's fault," and that it was only with TSMC's help that the company recovered from the setback and resumed chip manufacturing "at an astonishing speed."

Morgan Stanley, a major Wall Street firm, predicts that NVIDIA's Blackwell architecture AI GPU will enter mass production in the fourth quarter of 2024, contributing $10 billion in revenue in that quarter at a very high profit margin.

The California-based chip giant announced on Thursday that its CEO Jensen Huang will hold face-to-face talks with Mukesh Ambani, Chairman of India's Reliance Industries and Asia's richest man, in Mumbai, the commercial capital of India.

NVIDIA stated that the company is launching a new lightweight language model in India called "Nemotron-4-Mini-Hindi-4B," with 4 billion parameters, for major Indian companies to develop their own AI applications similar to ChatGPT, or to build more powerful or specialized AI models based on this lightweight foundation.

NVIDIA said in a statement that the model was pruned, distilled, and trained on a combination of real-world Hindi data, synthetic Hindi data, and a roughly equal amount of English data.

NVIDIA said in the statement that Tech Mahindra, an Indian IT services and consulting company, is the first Indian company to use NVIDIA's AI model offerings to develop a customized AI model, named Indus 2.0. The model, tuned by Tech Mahindra, focuses on Hindi and dozens of the country's dialects.

The Indian Constitution recognizes 22 languages, and although India is a member of the Commonwealth, only about one-tenth of its 1.4 billion people use English in daily communication.

From large corporations to startups, Indian enterprises are focusing on building AI models for the country's many languages in order to broaden their appeal to India's vast consumer base, and to build high-quality local-language customer service and content translation into their business operations and generative AI products. The goal is to keep improving profitability and operational efficiency with the help of AI.

Unlike large language models with strong reasoning capabilities, such as OpenAI's GPT-4 and GPT-4o that power ChatGPT, lightweight small language models have fewer parameters and are usually pre-trained on smaller, more specific datasets. They are typically cheaper to run and easier to customize to a company's needs, which can yield more accurate generative AI applications in particular fields and makes them more attractive to companies with limited AI computing resources. Whereas general-purpose large models aim at broad coverage, lightweight models are usually deployed for specific tasks or applications, such as AI customer service for a particular customer group, text generation in a specific language, or translation.

Leading global chip companies including NVIDIA, Micron, and Applied Materials are investing in India and accelerating the construction of facilities to expand their business in the fast-growing Indian market. With substantial financial backing from the Modi government, the country is working to build a domestic semiconductor supply chain, offering generous subsidies to foreign chip giants, especially for chip manufacturing plants, in order to compete with major global semiconductor hubs such as South Korea, China, and Taiwan.

However, analysts generally believe that it may take India several years to establish even a basic semiconductor supply chain, and that this may be difficult to achieve before 2030. NVIDIA, which first set up operations in India nearly twenty years ago, has engineering and design centers in the country as well as offices in major cities such as Bangalore in the south and nearby Hyderabad.

Software business could push NVIDIA's stock price to $200

Rosenblatt, a well-known Wall Street investment firm, said it is more optimistic about the revenue potential of CUDA, the core of NVIDIA's software business, than about NVIDIA's AI GPU revenue. In a research report, Rosenblatt chip industry analyst Hans Mosesmann raised the firm's 12-month price target for NVIDIA sharply, from $140 to $200 per share, the highest NVIDIA target on Wall Street.

According to the latest forecast in research firm IDC's "Global AI and Generative AI Spending Guide," global spending on artificial intelligence (covering AI-supported applications, AI chips, AI infrastructure, and related IT and business services) will at least double from its current level by 2028, reaching approximately $632 billion. AI, and generative AI (GenAI) in particular, is rapidly being integrated into a wide range of end-user devices and products. IDC predicts that global AI spending will grow at a compound annual growth rate (CAGR) of 29.0% over the 2024-2028 forecast period.
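As a rough consistency check on the IDC figures above, the short Python sketch below back-calculates the 2024 spending level implied by a 29.0% CAGR and a 2028 total of about $632 billion. It assumes that 2024-2028 means four compounding years; the function and variable names are illustrative, and the implied 2024 figure is derived arithmetic, not a number published by IDC.

```python
# Back-of-the-envelope check of the IDC figures quoted in the article.
# Assumption: 2024 -> 2028 counts as four compounding years.

def implied_base(final_value: float, cagr: float, years: int) -> float:
    """Starting value that grows to final_value at the given CAGR."""
    return final_value / (1.0 + cagr) ** years

FINAL_2028_BN = 632.0   # approximate 2028 spending, in billions of USD
CAGR = 0.29             # 29.0% compound annual growth rate
YEARS = 4               # assumed number of compounding years

base_2024_bn = implied_base(FINAL_2028_BN, CAGR, YEARS)
growth_multiple = FINAL_2028_BN / base_2024_bn

print(f"Implied 2024 spending: ${base_2024_bn:.0f} billion")
print(f"Growth multiple 2024-2028: {growth_multiple:.2f}x")
```

Under these assumptions the implied 2024 base is roughly $228 billion and the growth multiple is about 2.8 times, which is consistent with the statement that spending will at least double.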

IDC points out that software and applications will be the largest category of AI technology spending, accounting for more than half of the overall AI market in most of its forecasts. IDC predicts that AI software will grow at a five-year CAGR of 33.9% over 2024-2028. This is why NVIDIA is looking beyond AI GPU revenue, where it holds what can be described as a monopoly in AI infrastructure, and has recently put more emphasis on large language models, lightweight models, and helping enterprises worldwide deploy AI applications on its high-performance AI GPU cloud inference capacity and the CUDA platform.

The CUDA ecosystem can be described as NVIDIA's "strongest moat". NVIDIA has invested in high-performance computing for many years, and the CUDA computing platform it built from scratch is now used worldwide as the preferred combined software and hardware stack for high-performance workloads such as AI training and inference. CUDA is a proprietary parallel computing platform and programming toolkit developed by NVIDIA that lets software developers and engineers use NVIDIA GPUs for general-purpose parallel computing. It runs only on NVIDIA GPUs and is not compatible with GPUs from other mainstream vendors such as AMD and Intel.

NVIDIA also offers enterprises and individual users a complete cloud-based solution for fine-tuning and customizing AI models with a low technical barrier and without buying expensive AI GPUs. Building on the widely adopted CUDA platform and its high-performance AI GPUs, NVIDIA has recently been strengthening its position across the full software and hardware stack. At its GTC conference in March, NVIDIA officially launched a microservice called "NVIDIA NIM," which is billed by GPU usage time. NIM is a cloud-native microservice focused on optimization; it aims to shorten the time to market for generative AI applications built on various types of AI models and to simplify their deployment in the cloud, in data centers, and on GPU-accelerated workstations. It lets enterprises deploy AI applications on NVIDIA's AI GPU cloud inference capacity and the CUDA platform, as NVIDIA seeks to establish a full-stack AI application development ecosystem exclusive to its own GPUs.