By Eric Vandenbroeck and co-workers
The Illusion Of China’s AI Prowess
The artificial intelligence revolution has reached the US Congress. The staggering potential of robust AI systems, such as OpenAI’s text-based ChatGPT, has alarmed legislators, who worry about how advances in this fast-moving technology might remake economic and social life. Recent months have seen a flurry of hearings and behind-the-scenes negotiations on Capitol Hill as lawmakers and regulators determine how best to impose limits on the technology. But some fear that any regulation of the AI industry will incur a geopolitical cost. In a May hearing at the U.S. Senate, Sam Altman, the CEO of OpenAI, warned that “a peril” of AI regulation is that “you slow down American industry in such a way that China or somebody else makes faster progress.” That same month, AI entrepreneur Alexandr Wang insisted that “the United States is in a relatively precarious position, and we have to make sure we move fastest on the technology.” Indeed, the notion that Washington’s propensity for red tape could hurt it in the competition with Beijing has long occupied government and private sector figures. Former Google CEO Eric Schmidt claimed in 2021 that “China is not busy stopping things because of regulation.” According to this thinking, if the United States places guardrails around AI, it could end up surrendering international AI leadership to China.
In the abstract, these concerns make sense. It would not serve U.S. interests if a regulatory crackdown crippled the domestic AI industry while Chinese AI companies, unshackled, could flourish. But a closer look at the development of AI in China, especially that of large language models (LLMs), the text generation systems that underlie applications such as ChatGPT, shows that such fears are overblown. Chinese LLMs lag behind their U.S. counterparts and largely depend on American research and technology. Moreover, Chinese AI developers face a far more stringent and limiting political, regulatory, and economic environment than their U.S. counterparts. Even if it were true that new regulations would slow innovation in the United States (and it may not be), China does not appear poised to surge ahead.
U.S. companies are building and deploying AI tools at an unprecedented pace, so much so that even they are actively seeking guidance from Washington. This means that policymakers considering how to regulate the technology are in a position of strength, not weakness. Left untended, the harms from today’s AI systems will continue to multiply, while the new dangers produced by future systems will go unchecked. An inflated impression of Chinese prowess should not prevent the United States from taking meaningful and necessary action now.
The Sincerest Form Of Flattery
Over the past three years, Chinese labs have rapidly followed in the footsteps of U.S. and British companies, building AI systems similar to OpenAI’s GPT-3 (the forerunner to ChatGPT), Google’s PaLM, and DeepMind’s Chinchilla. But in many cases, the hype surrounding Chinese models has masked a lack of real substance. Chinese AI researchers we have spoken with believe that Chinese LLMs are at least two or three years behind their state-of-the-art counterparts in the United States, perhaps even more. Worse, AI advances in China rely greatly on reproducing and tweaking research published abroad, a dependence that could make it hard for Chinese companies to assume a leading role in the field. If the pace of innovation slackened elsewhere, China’s efforts to build LLMs, like a slower cyclist coasting in the leaders’ slipstream, would likely decelerate.
Take, for instance, the Beijing Academy of Artificial Intelligence’s WuDao 2.0 model. After its release in the summer of 2021, Forbes hailed the model as an example of “bigger, stronger, faster AI,” largely because WuDao 2.0 boasted ten times more parameters (the numbers inside an AI model that determine how it processes information) than GPT-3. But this assessment was misleading in several ways. Merely having more parameters does not make one AI system better than another, especially if the increase is not matched by corresponding increases in data and computing power. In this case, comparing parameter counts was especially unwarranted given that WuDao 2.0 worked by combining predictions from a series of models rather than as a single language model, a design that artificially inflated the parameter count. Moreover, the way researchers posed questions to the model made its performance in specific trials appear stronger than it was.
Baidu’s “Ernie Bot” was also a disappointment. Touted as China’s answer to ChatGPT, Ernie Bot was, like WuDao 2.0, clearly developed under pressure to keep up with a high-profile breakthrough in the United States. The Chinese bot failed to live up to expectations. Baidu’s launch event included only prerecorded examples of its operation, a telltale sign that the chatbot was unlikely to perform well in live interactions. Reviews from users who have since gained access to Ernie Bot have been mediocre at best, with the chatbot stumbling on simple tasks such as basic math or translation questions.
Chinese AI developers struggle with the pressure to keep up with their U.S. counterparts. In August 2021, more than 100 researchers at Stanford collaborated on a significant paper about the future of so-called “foundation models,” a category of AI systems that includes LLMs. Seven months later, the Beijing Academy of AI released a similarly lengthy literature review on a related subject, with almost as many co-authors. But within a few weeks, a researcher at Google discovered that large sections of the Chinese paper had been plagiarized from a handful of international papers—perhaps, Chinese-language media speculated, because the graduate students involved in drafting the essay faced extreme pressure and were up against very short deadlines.
Americans should not be haunted by the prospect of an imminent Chinese surge in LLM development. Chinese AI teams are fighting, and often failing, to keep up with the blistering speed of new research and products emerging elsewhere. On LLMs, China trails its international competitors by years, not months.
Headwinds And Handicaps
Forces external to the AI industry also impede the pace of innovation in China. Due to the outsized computational demands of LLMs, international competition over semiconductors inevitably affects AI research and development. The Chinese semiconductor industry can only produce chips several generations behind the latest cutting-edge ones, forcing many Chinese labs to rely on high-end chips developed by U.S. firms. In recent research analyzing Chinese LLMs, we found 17 models that used chips produced by the California-based firm NVIDIA; by contrast, we identified only three models built with Chinese-made chips.
Huawei’s PanGu-α, released in 2021, was one of the three exceptions. Trained using Huawei’s in-house Ascend processors, the model appears to have been developed with significantly less computational power than best practices recommend. Although it is currently perfectly legal for Chinese research groups to access cutting-edge U.S. chips by renting hardware from cloud providers such as Amazon or Microsoft, Beijing must be worried that the intensifying rhetoric and restrictions around semiconductors will hobble its AI companies and researchers.
More broadly, pessimism about China's overall economic and technological outlook may hamper domestic AI efforts. In response to a wave of regulatory scrutiny and a significant economic slowdown in the country, many Chinese startups are now opting to base their operations overseas and sell to an international market instead of selling primarily to the Chinese market. This shift has been driven by the increasing desire among Chinese entrepreneurs to gain easier access to foreign investment and to escape China’s stringent regulatory environment—while also skirting restrictions imposed on Chinese companies by the United States.
HAL, Meet Big Brother
China’s thicket of restrictions on speech also poses a unique challenge to the development and deployment of LLMs. The freewheeling way LLMs operate—following the user’s lead to produce text on any topic, in any style—is a poor fit for China’s strict censorship rules. In a private conversation with one of us, one Chinese CEO quipped that China’s LLMs are not even allowed to count to 10, as that would include the numbers eight and nine—a reference to the state’s sensitivity about the number 89 and any discussion of the 1989 Tiananmen Square protests.
Because the inner workings of LLMs are poorly understood, even by their creators, existing methods for putting boundaries around what they can and cannot say function more like sledgehammers than scalpels. Companies face a stark tradeoff between how useful the AI’s responses are and how well they avoid undesirable topics. LLM providers everywhere are still figuring out how to navigate this tradeoff, but the potentially severe ramifications of a misstep in China force companies there to choose a more conservative approach. Popular products such as the Microsoft spinout XiaoIce are prohibited from discussing politically sensitive topics such as the Tiananmen Square protests or Chinese leader Xi Jinping. Some users we spoke to even claim that XiaoIce has gotten less functional over time, perhaps as Microsoft has added guardrails. Journalists have likewise found that Baidu’s Ernie Bot gives canned answers to questions about Xi and refuses to respond on other politically charged topics. Given the wide range of censored opinions and subjects in China (from the health of the Chinese economy to the progress of the war in Ukraine to the definition of “democracy”), developers will struggle to make chatbots that do not cross red lines while still being able to answer most questions normally and effectively.
In addition to these political constraints on speech, Chinese AI companies are also subject to the country’s unusually detailed and demanding regulatory regime for AI. One set of rules came into force in January 2023 and applies to providers of online services that use generative AI, including LLMs. A draft of further requirements, which would apply to research and development practices and AI products, was released for comment in April.
Some rules are straightforward, such as requiring that sensitive data be handled according to China’s broader data governance regime. Other provisions may prove quite onerous. The January regulations, for instance, oblige providers to “dispel rumors” spread using content generated by their products, meaning that companies are on the hook if their AI tools produce information or opinions that go against the Chinese Communist Party line. The April draft would go further still, forcing LLM developers to verify the truth and accuracy of what the AI programs produce and of the material used to train the programs in the first place. This requirement could be a severe headache in a field that relies on massive stores of data scraped from the Web. When carefully designed, regulation need not obstruct innovation. But so far, the CCP’s approach to regulating LLMs and other generative AI technology appears so heavy-handed that it could prove a real impediment to Chinese firms and researchers.
Fear Of The Chimera
Despite its difficulties, Chinese AI development may yet turn a corner and establish a greater track record of success and innovation. Americans, however, have a history of overestimating the technological prowess of their competitors. During the Cold War, bloated estimates of Soviet capabilities led U.S. officials to make policy based on a hypothesized “bomber gap” and then a “missile gap,” both of which later proved to be fictional. A similarly groundless sense of anxiety should not determine the course of AI regulation in the United States. After all, where social media companies resisted regulation, AI firms have already asked for it. Five years ago, Facebook founder Mark Zuckerberg warned Congress that breaking up his social media company would only strengthen Chinese counterparts. In AI, by contrast, industry leaders are proactively calling for regulation.
If anything, regulation is where the United States most risks falling behind in AI. China’s recent regulations on generative AI build on top of existing rules and a detailed data governance regime. The European Union, for its part, is well on its way to passing new rules about AI in the form of the AI Act, which would categorize levels of risk and impose additional requirements for LLMs. The United States has not yet matched such regulatory efforts, but even here, U.S. policymakers are in better shape than often assumed. The federal government has drafted thorough frameworks for managing AI risks and harms, including the White House’s Blueprint for an AI Bill of Rights and the National Institute of Standards and Technology’s AI Risk Management Framework. These documents provide in-depth guidance on navigating this general-purpose technology’s multifaceted risks, harms, and benefits. What is needed now is legislation that allows the enforcement of the key tenets of these frameworks to protect the rights of citizens and place guardrails around the rapid advance of AI research.
There are still plenty of issues to work through, including where new regulatory authorities should be housed, what role third-party auditors can play, what transparency requirements should look like, and how to apportion liability when things go wrong. These are thorny, urgent questions that will shape the future of technology, and they deserve serious effort and policy attention. If the chimera of Chinese AI mastery dissuades policymakers from pursuing industry regulation, they will only be hurting U.S. interests.