The Chinese quantitative fund that has become an AI pioneer

One of the Chinese AI contenders taking on OpenAI comes from an unusual source: a quant fund that dominates the country’s financial sector.

High-Flyer Capital Management, a Chinese quantitative hedge fund, has grown into a roughly Rmb60 billion ($8 billion) asset manager since launching in 2015, partly by using AI and algorithms to identify patterns or variables that can influence stock prices.

Now it has turned that expertise and infrastructure into a powerful, publicly released AI model that experts say is on par with leading Western efforts. DeepSeek-V2 can answer questions, write code and reason.

DeepSeek costs significantly less than rivals, at around Rmb2 per million output tokens (the units of text a model returns in response to a query), which has set off a price war among Chinese artificial intelligence vendors.

A week after DeepSeek-V2 launched in May, tech giant ByteDance cut its prices to just Rmb0.60 per million output tokens. Rival Alibaba then cut usage prices on some of its models by as much as 97 percent, and Baidu made two of its Ernie models free.

The rollout of the new model, which has quickly attracted thousands of Chinese developers, shows how tech giants like Baidu and Alibaba, even with early leads in generative AI, face stiff competition from more agile upstarts. It has also put a spotlight on China’s highly competitive generative AI race.

“The gap between the US and China is not as big as everyone thinks,” Liu Qingfeng, founder of Chinese AI group iFlytek, told a recent technology meeting in Macau. “In many industries, our [models] are better than theirs.”

DeepSeek’s development is made possible by funding from its sister hedge fund, High-Flyer. High-Flyer’s funds have returned 151 percent, or 13 percent annualized, since 2017 in China’s battered domestic stock market. The country’s CSI 300 index, which tracks China’s 300 largest stocks, rose 8 percent over the same period, according to research firm Simu Paipai.

In February, Beijing cracked down on quantitative funds, blaming their rapid algorithmic trading for a stock market sell-off at the start of the year. Since then, High-Flyer’s funds have lagged the CSI 300 by four percentage points.

High-Flyer and DeepSeek did not respond to requests for comment.

The quantitative fund started in an apartment in Chengdu, where founder Liang Wenfeng, a computer science graduate from Zhejiang University, experimented with automated stock trading, according to local media reports. According to his profile in China’s Register of Asset Management Organizations, he was a freelancer until 2013, when he founded his first investment firm.

In 2021, all of High-Flyer’s strategies used AI, according to manager Cai Liyu, employing approaches similar to those of the hugely profitable hedge fund Renaissance Technologies. “AI helps extract valuable data from huge data sets, which can be useful for predicting stock prices and making investment decisions,” he said at a roadshow streamed online that year.

Cai said the company’s first computer cluster cost nearly Rmb200 million and that High-Flyer invested about Rmb1 billion in building a second supercomputer cluster, which would span an area about the size of a football field. Most of its profits went back into its AI infrastructure, he added.

The second cluster, now complete, connects more than 10,000 of Nvidia’s advanced processors to servers and storage, giving DeepSeek the computing power to train a large model, according to archived versions of the company’s website. The group acquired the Nvidia A100 chips before Washington restricted supplies to China in mid-2022.

“We always wanted to conduct experiments on a larger scale, so we have always strived to use as much computing power as possible,” founder Liang told Chinese technology site 36Kr last year. “We wanted to find a paradigm that can fully describe the entire financial market.”

The company is one of six Chinese groups with more than 10,000 A100 processors, a level widely believed to be the computing threshold for training large models independently, according to Guosheng Securities. The other five are all Chinese tech giants, though their collective computing power pales in comparison to that of American companies. Meta has said it will have computing power equivalent to nearly 600,000 of Nvidia’s more advanced H100 chips by the end of the year.

Tests conducted by research groups rank DeepSeek-V2 among the best LLMs in the world. Researchers from the University of Waterloo in Canada ranked it among the top 10 models, behind OpenAI’s GPT-4, Anthropic’s Claude and Chinese rival 01.AI.

DeepSeek’s model is also open source, allowing AI researchers to inspect and copy its structure.

“The architecture of the model is very unique,” said Andrew Carr, chief scientist at Cartwheel, an AI animation start-up based in the US. “DeepSeek has taken this idea of mixture of experts, where you break a model into smaller pieces, to the extreme, with hundreds of small experts.”

Carr said the model was close in capability to Meta’s latest Llama 3 model, but at a lower price point. The price is about one-hundredth the cost of OpenAI’s GPT-4 and one-fifth that of Anthropic’s Claude 3 Haiku.

Tiezhen Wang, an engineer at New York-based AI research center Hugging Face, said DeepSeek’s team had reduced what the model had to remember while allowing it to “accomplish more tasks without slowing down.”

Within China, the pricing strategy has helped attract developers. Wang Zixu, a programmer from northern China, said he switched from using OpenAI’s GPT-4 for coding assistance to DeepSeek because of its lower prices.

Even with the cost advantage, some industry experts said DeepSeek could lose money at its low price. DeepSeek’s computing power could also fall further behind if Nvidia releases new chips that are not allowed to be exported to China.

Still, High-Flyer’s AI offshoot aims to be the first to achieve artificial general intelligence, the point at which machines have greater cognitive capabilities than humans.

“We believe AGI is the violent beauty of model x data x computing power,” says a job ad for DeepSeek. “Start with us on a ‘deep search’ on the road to AGI!”

Additional reporting by Nian Liu in Beijing
