Abstract: Ant Group, an affiliate company of Chinese conglomerate Alibaba Group, has developed AI training techniques using Chinese-made semiconductors that could reduce costs by 20%.
(Image credit: Photo by editor Lin Zhijia)
AsianFin -- Ant Group, an affiliate company of Chinese conglomerate Alibaba Group, has developed AI training techniques using Chinese-made semiconductors that could reduce costs by 20%.
The fintech giant used domestic chips, including from affiliate Alibaba Group Holding Ltd. and from Huawei Technologies Co., to train models through the Mixture of Experts (MoE) machine learning approach, sources familiar with the matter said.
These models reportedly achieved performance levels comparable to those trained on Nvidia's H800 GPUs. While Ant continues to use Nvidia chips for AI development, it has increasingly turned to alternatives, including chips from Advanced Micro Devices Inc. as well as from Chinese semiconductor providers, one source added.
Ant's latest development signals its entry into the intensifying competition between Chinese and U.S. firms to develop cutting-edge AI models. The race has gained momentum since DeepSeek demonstrated that highly capable AI models can be trained at a fraction of the cost spent by OpenAI and Alphabet Inc.'s Google.
The shift also underscores how Chinese companies are working to reduce reliance on Nvidia's advanced semiconductors, which are subject to U.S. export restrictions. While the H800 is not the most advanced Nvidia GPU, it remains one of the most powerful AI chips currently banned from sale to China.
This month, Ant published a research paper claiming that its AI models outperformed those of Meta Platforms Inc. on certain benchmarks. These claims have not been independently verified, but if Ant's technology performs as advertised, it could advance China's AI development by lowering the cost of inference and supporting a wider range of AI applications.
The MoE machine learning technique, which Ant has adopted, has gained traction among AI leaders such as Google and DeepSeek. This method breaks down tasks into smaller specialized segments, much like a team of experts each handling different parts of a job, leading to more efficient processing.
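For readers unfamiliar with the technique, the sketch below shows a minimal MoE layer in PyTorch: a router scores each token and only the top-scoring experts actually run. The layer sizes, expert count, and top-2 routing here are illustrative assumptions for the example, not details of Ant's Ling models.

```python
# A minimal Mixture-of-Experts layer, for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model=64, d_hidden=128, n_experts=4, top_k=2):
        super().__init__()
        # Each "expert" is a small feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        )
        # The router scores every token against every expert.
        self.router = nn.Linear(d_model, n_experts)
        self.top_k = top_k

    def forward(self, x):  # x: (tokens, d_model)
        scores = self.router(x)                         # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)  # pick top-k experts
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        # Only the selected experts run for each token; this sparsity is what
        # lets MoE models grow in parameters without proportional compute.
        for e, expert in enumerate(self.experts):
            rows, slots = (idx == e).nonzero(as_tuple=True)
            if rows.numel():
                out[rows] += weights[rows, slots, None] * expert(x[rows])
        return out

layer = MoELayer()
print(layer(torch.randn(8, 64)).shape)  # torch.Size([8, 64])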
However, training MoE models typically requires high-performance GPUs, such as those from Nvidia. The prohibitive cost of these chips has restricted broader adoption, particularly among smaller AI firms. Ant has been actively working to make large language model (LLM) training more efficient, aiming to reduce dependence on premium GPUs. The company's research paper explicitly states its goal of scaling AI training without relying on high-end Nvidia chips.
This approach contradicts Nvidia's long-term vision. Jensen Huang, the CEO of Nvidia, has argued that AI computation demand will continue to grow, even with more efficient models like DeepSeek's R1. He maintains that companies will prioritize better chips to generate revenue, rather than cheaper chips to cut costs. As a result, Nvidia has continued its strategy of building increasingly powerful GPUs with higher processing power, more transistors, and greater memory capacity.
Ant Group's research highlights the rapid innovation within China's AI industry and suggests that the nation is making strides toward AI self-sufficiency. By adopting cost-efficient, computationally optimized AI models, China is actively working around U.S. restrictions on advanced Nvidia chips.
According to Ant's estimates, training a model on 1 trillion tokens (the basic units of data an AI system learns from) currently costs around 6.35 million yuan ($880,000) using conventional high-performance hardware. Ant's optimized approach would lower this to about 5.1 million yuan by training on less powerful hardware.
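As a quick sanity check, the reported figures do work out to roughly the 20% saving cited above:

```python
# Back-of-the-envelope check of the saving implied by Ant's reported figures.
baseline_yuan = 6.35e6   # conventional high-performance hardware, per 1T tokens
optimized_yuan = 5.1e6   # Ant's optimized approach
saving = (baseline_yuan - optimized_yuan) / baseline_yuan
print(f"cost reduction: {saving:.1%}")  # -> cost reduction: 19.7%, about 20%
```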
The company plans to apply its AI breakthroughs to industrial applications, particularly in healthcare and finance, sources said. Its Ling-Plus and Ling-Lite models, developed as part of this initiative, are expected to play a key role in these sectors.
Earlier this year, Ant acquired Chinese online platform Haodf.com, strengthening its AI-driven healthcare services. The company has also launched AI-powered applications, including Zhixiaobao, an AI-based life assistant, and Maxiaocai, a financial advisory AI service.
Ant's research suggests that its Ling-Lite model outperforms a Meta Llama model in English-language understanding benchmarks. Additionally, both Ling-Lite and Ling-Plus surpassed DeepSeek's equivalent models in Chinese-language tasks, demonstrating China's growing AI capabilities.
Robin Yu, Chief Technology Officer at Beijing-based Shengshang Tech Co., compared AI competition to martial arts, saying "If you find one point of attack to beat the world's best kung fu master, you can still say you beat them. That's why real-world application matters."
Ant has open-sourced its Ling models, allowing researchers and developers worldwide to explore its work. The Ling-Lite model has 16.8 billion parameters, while Ling-Plus has 290 billion. Parameters are the adjustable values a model learns during training, and their count is a rough gauge of a model's size and capability.
For comparison, industry experts estimate that ChatGPT's GPT-4.5—which has not been officially disclosed—likely contains around 1.8 trillion parameters, according to MIT Technology Review. DeepSeek-R1, another Chinese model, boasts 671 billion parameters.
Despite its achievements, Ant faced difficulties during training, particularly in ensuring model stability. The company noted that even minor changes in hardware or the model's architecture caused instability, leading to spikes in the error rate. These challenges highlight the complexity of building AI models without relying on the industry's most advanced chips.
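Ant's paper does not detail its remedies here, but one common generic safeguard against such loss spikes is to compare each training step's loss against a recent running average and skip or roll back updates that jump too far. The sketch below illustrates that general idea only; it is not Ant's method, and the window size and threshold are arbitrary.

```python
from collections import deque

def is_spike(loss, history, factor=2.0):
    """Flag a loss far above the recent average; keep spikes out of the window."""
    if history and loss > factor * (sum(history) / len(history)):
        return True
    history.append(loss)
    return False

history = deque(maxlen=100)  # rolling window of recent losses
for step, loss in enumerate([2.1, 2.0, 1.9, 9.5, 1.8]):  # toy loss curve
    if is_spike(loss, history):
        print(f"step {step}: loss spike ({loss}), skip update or reload checkpoint")
```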
As China accelerates its push toward AI self-reliance, Ant Group's latest advancements reflect the country's determination to innovate despite export restrictions. If successful, these breakthroughs could reduce China's dependence on foreign semiconductors, a key goal in the ongoing U.S.-China technology rivalry.
Source: TMTPost