The state of China’s generative AI

Politics & Current Affairs

This week on Sinica Kaiser welcomes back Paul Triolo, senior VP for China and technology policy lead at Dentons Global Advisors, to discuss the state of play in Chinese Generative AI.

Illustration for The China Project by Derek Zheng

Below is a complete transcript of the Sinica Podcast with Paul Triolo.

Kaiser Kuo: Welcome to the Sinica Podcast, a weekly discussion of current affairs in China, produced in partnership with The China Project. Subscribe to Access from The China Project to get, well, access. Access to not only our great newsletter, the Daily Dispatch, but to all of the original writing on our website at thechinaproject.com. We’ve got reported stories, essays and editorials, great explainers, regular columns, and, of course, a growing library of podcasts. We cover everything from China’s fraught foreign relations to its ingenious entrepreneurs, from the ongoing repression of Uyghurs and other Muslim peoples in China’s Xinjiang region to Beijing’s ambitious plans to shift the Chinese economy onto a post-carbon footing. It’s a feast of business, political, and cultural news about a nation that is reshaping the world. We cover China with neither fear nor favor.

I’m Kaiser Kuo, coming to you from Chapel Hill, North Carolina. The sudden arrival of Generative AI, large language models like ChatGPT or Google’s Bard, as well as the proliferation of image, video, and audio AI, certainly feels, at least from where I sit, like a technological step change just from a few months ago, really. As we’ve talked about on previous shows, Chinese AI researchers were astonished at what they saw once OpenAI took the wraps off of ChatGPT a few months ago. And when some Chinese companies hurriedly released their own LLMs, reviewers, for the most part, agreed that they were not quite ready for primetime.

All this really lit a fire under China’s AI scientists and researchers, whether in tech companies or in research institutes. The fact that the initial draft regulations governing Generative AI, something that we talked about on the show a few months ago, had their most onerous and potentially cumbersome requirements stripped out before the actual regulations were introduced shows that Beijing recognizes the importance of this technology and wants, at the very least, to avoid hobbling its domestic companies before they even get out of the starting gate.

But will China’s heavy-handed internet censorship hamper its development? And what about U.S. efforts to keep the cutting-edge GPUs, the graphics processing units, or the NPUs, the neural processing units, out of Chinese hands? With me, to talk about the state of play in Chinese Generative AI, is Paul Triolo, senior VP for China and technology policy lead at Dentons Global Advisors ASG, formerly, and probably still better known as, Albright Stonebridge Group. He spent a full quarter-century as a research analyst for the U.S. government at a certain three-letter agency, and he contributes frequently to The China Project, where he writes on all sorts of technology issues pertaining to China, for which we are very grateful. Paul joins us from Washington, D.C. Paul, welcome back to Sinica.

Paul Triolo: Thank you, Kaiser. I’m happy to be back.

Kaiser: Yeah. Well, the last time I saw you was actually just a few weeks ago in DC, and you were leaving the very next morning for China. So, tell us, just in a nutshell, how was that trip?

Paul: It was a great trip, Kaiser. It was largely client-focused, but I managed to find time to discuss this issue of Generative AI with some of the leading Chinese companies that are developing their own large language models and have been doing a lot in the AI space.

Kaiser: Yeah. So that’s exactly why I wanted to talk to you because you’re fresh from there. You’ve had a lot of conversations with people across the AI community in China. So how did that community react to the unveiling of ChatGPT by OpenAI?

Paul: Yeah, it’s a good question, and it sort of hit two different pieces of the system, right? So, it hit the sort of enterprises in China that are developing these types of capabilities. It wasn’t a surprise, because the existence of transformers and large language models had been well known in China; Chinese companies and Chinese researchers are very plugged in to the global AI community. So, not unknown. I think they were surprised that a company like OpenAI decided to release this to the public as an experiment. There was some surprise that they were able to do that. That probably couldn’t happen in China in the same way.

And then the regulators in China, I think, were also sort of forced to respond to this, because up until that point, regulators were certainly aware of these developments, but they hadn’t really galvanized themselves to figure out how to regulate things like Generative AI. And I think it’s important to note that because of the human interaction and the content, not just text but text and image and multimodal kinds of Generative AI models, this pretty quickly becomes an issue that regulators are interested in trying to get ahead of. And I think that the release of ChatGPT certainly got the attention of both the AI community in China, in that sort of alarmist way; like, we better catch up because OpenAI has leaped ahead now in the development of these models.

And also the regulators were thinking, “Okay, well, if this is going to be out there, we better figure out what to do about it,” in particular because it’s so focused on natural language processing, and that sort of human interaction is the part that is of concern to regulators.

Kaiser: No, that makes a ton of sense. So, can you talk broadly about what was on the minds of the technologists you spoke with? I mean, are they smarting over the October 7th announcement? Are they really, really worried? We’ll talk a little bit more about the GPU issue, but are they obsessed with Generative AI, or what’s top of mind for them?

Paul: Well, I mean, I think it’s important to just step back quickly and note that since 2017, 2018, when China came out with the National AI Development Plan, for example, AI has not been a sort of a new thing for Chinese technology companies. I mean, going back a decade, Chinese companies have been tracking the cutting-edge developments around the world. And I think in China, the emphasis has been much more on the sort of business and practical implications of AI, right? And so when I did a paper with Kai-Fu Lee, for example, in 2017, that previewed his book, AI Superpowers, we focused on different types of AI: perception AI, internet AI, recommendation algorithms and logistics, and then sort of autonomous AI, these four sectors. We didn’t, at that time, of course, include things like Generative AI.

So I think that AI companies in China have been very focused and foresighted in continuing to develop very practical applications for AI over the past five years. Generative AI sort of injects this new wrinkle into the game because it has implications in terms of increased productivity and all these areas that people have held out hope for: faster processing of text and language, and assisting in things like code generation, which is a huge potential application that’s already being used for Generative AI. So, Generative AI sort of changes the game a little bit because it introduces new business models and new areas where AI can be applied. And so, Chinese companies are, again, very plugged into sort of the directionality of the AI development process.

And so they, of course, were already looking very hard at things like transformers and the large language models built on those transformers. And so I think they were, again, not surprised, but their focus has been a little different than that of companies in the U.S., like OpenAI in particular, and then all the other many companies that are now focused on generating large language models and applying them in various industry verticals. They’re probably, more than any other country and more than any other AI community, really plugged into what’s going on and tracking developments and trying to take advantage of the sort of state of the art.

Kaiser: I don’t think there’s any doubt that in China right now, the flavor of the month is Generative AI, right? There are, of course, as you say, all these other approaches, some of which China has made real headway in. And some of these are continuing to yield major breakthroughs. I think it’s important that we at least talk a little bit about those. Talk about what some of these other areas are. Is there concern that these things might get crowded out due to the interest in LLMs and other Generative AI? I mean, we were talking the other day about things like AlphaFold that Google DeepMind has done. I mean, these are deep reinforcement learning-based AI platforms. Can you talk a little bit about that and the promise that that holds out?

Paul: Yeah, I mean, and of course, one of the big factors that drove that National AI Development strategy, for example, released in 2017, was this concern over Go and AlphaGo, and the perception that Western companies like Google DeepMind were leading in these areas and had developed these amazing capabilities using, again, deep reinforcement learning. It seems this is sort of the process that led to something like AlphaFold, which is this very specialized AI capability designed to predict the shape of proteins that has very, very specific and very useful end-use. And so, I think Chinese companies have also been looking at those kinds of applications across things like healthcare, obviously, and drug discovery and those kinds of areas. So, I think that kind of interest is still there. Generative AI, though, has obviously now, as you know, gobbled up a lot of sort of attention and investment in part because it is a broader general-purpose capability.

And that’s, of course, what generates a lot of concern is that it is designed to sort of eventually lead to some sort of artificial general intelligence. And this seems closer to that than something like AlphaGo or AlphaFold, which are very narrow applications to gaming, for example, and to things like protein predictions. So, Generative AI sort of generates a lot of fear in some sense about things like, “Hey, we’re going to get eventually to something that’s smarter than humans.” And that’s generated that huge debate in the U.S. about whether there’s a sort of existential threat from AI.

Kaiser: In the U.S., that certainly has been the case, but what about China?

Paul: Yeah, I think, in China, that debate hasn’t been as sort of public. One of the issues is that I think in China, the tech companies have sort of led the development of this. There isn’t this huge independent AI safety community of the type that has developed in the U.S. and in Europe around looking at regulation and worrying about all these downsides of AI like bias and disinformation and these other areas. I think in China, the tendency has been to see technology in general and AI as beneficial, and so to emphasize more the beneficial aspects, the potential gains from deploying AI applications. Whereas in the West, we have this very big and growing community that’s focused on safety and ensuring that AI is deployed in fair and equitable ways.

And so, in China, there’s not this rush to think about the existential threats. The companies are very focused, for example, on earning revenue using these AI applications, whether it’s with logistics or other things we’ve talked about. And then Generative AI falls into that category of, “Hey, here’s a new tool, and we can use this to help develop specific applications that will be useful to companies.” And so they haven’t leaped to the sort of existential risks as we saw in the last few months in the U.S., with these various documents coming out and people signing various letters that called out concern about this potential existential threat from AI.

Kaiser: So, so far, no Chinese Eliezer Yudkowsky has emerged?

Paul: No, that’s a good question. There was one professor, I think from Tsinghua, who signed onto one of the letters. But no, not the debate you have in the West, where the fathers of AI, for example, Yann LeCun, Geoff Hinton, and Yoshua Bengio, have been part of this huge debate over the last particularly six months about the downsides, on both sides of the topic. And in China, again, the Chinese companies and the Chinese AI community have been sort of more focused on basic research and on application.

Kaiser: Well, as I’ve said many times, China is in its Star Trek phase, and we are in our Black Mirror phase already. So yeah, that’s how it goes. So as you know, the regulations on Generative AI were officially promulgated just, I guess, about a month or so ago after a comment period on the initial draft regs. I talked about this, as I said, briefly on the show with Jeremy Daum after they came out. But that show was about something else entirely, and so we just sort of flicked at it. But let me just summarize quickly what Jeremy said because he studied this stuff very, very carefully. First, he said, overall, they are much less stringent than what was laid out in the draft. And they appear to have backed down on certain requirements with the aim, specifically, of allowing the new industry to really develop and flourish.

Service providers no longer need to guarantee the truth, so to speak, of either the generated content or even of the underlying data; the training data was originally supposed to actually be combed over for truth. That was obviously a major hurdle and was basically impossible. I imagine the development community is pretty happy about that change. They also added, on this point about concern over safety, and I’m glad to see this is coming from the CAC itself, more categories of discrimination. There’s language in there about bias, but they’ve added more categories of people they’re worried may be discriminated against, including people with disabilities.

And perhaps most importantly, I think maybe they made these new regulations apply only to the public-facing platforms, right? The public-facing Generative AI. So, it doesn’t apply to systems that are purpose-built just for an enterprise or within, even like an industry vertical.

Paul: Right. And that’s critical. Yeah.

Kaiser: Yeah, absolutely critical. So, with all that as background, when you spoke with AI researchers in China during your trip, did you get a sense of whether they felt like their feedback had helped to bring about this change? Or did they talk about the AI regs and whether they still need to be tweaked?

Paul: Yeah, I talked with a lot of people about this. I think it’s important to note that these are still interim regulations. Part of the game here is that the content regulator, which is really putting out these regulations, the Cyberspace Administration of China, is very, very focused on the content part of it. And they’re still trying to figure out what to do about all this, right? So the Chinese approach is interesting, and we can talk about the comparison to what the EU and the U.S. are doing because it’s quite different. The regulators in China tend to want to get out in front of this quickly. And so that’s what they’ve done. It’s why they issued these draft regulations in April and quickly turned them around into these interim regs.

And yes, the industry had a huge input on this. I talked to a lot of people: Tencent, Alibaba, Baidu, all the key players, all the key sort of research institutes, the Beijing Academy of Artificial Intelligence. All the major big players in LLMs and generative AI in China absolutely weighed in on those draft regs. And once they came out, people were really calling them something like a light-touch regulation because of those things you mentioned, that they rolled back a lot of the provisions. For example, another one that was important was they had talked about needing to turn around and redo the model if there was a problem detected with content that was sort of politically incorrect.

They dropped a two-month or three-month turnaround on that because that was considered by the industry to be sort of wildly unreasonable, right? Because it takes time to train these models and to do things like reinforcement learning from human feedback, which is an effort to put guardrails on this. But I think it’s important to note that, as you said, the thrust of those regulations was really for these public-facing sorts of applications of these models. And that’s not where the Chinese companies are focused. They’re very much focused on these industry verticals, which I think… And we can talk about each of the major players and where they are in that. Therefore, they’re less concerned, I would say, about the regulation than about some of these other issues which we need to talk about, like U.S. export controls on GPUs, which is a huge issue for the industry in general. But I think, in general, people were happy with the process of feedback and the fact that CAC listened to industry concerns.

But again, these are interim regulations. There probably is more to come. And then the last thing I’ll mention on that, critically, is that yesterday the TC260, which is the standards body, released some really interesting draft standards related to things like watermarking: actually watermarking content to make sure that it’s clear what has been AI-generated. And it’s a really thoughtful document; if you read through it, it talks about two kinds of watermarks related to generative AI content. One is explicit: a human-visible or audible marker that would identify content as generated by an AI, a large language model, for example, or an image generator. And the other is implicit: a sort of watermark that humans can’t see, but that computers and machines can extract to determine the origin.

In that very practical area, which, again, is also under lots of discussion in the U.S. and EU, for example (how do you identify content that is AI-generated?), China is arguably already well ahead in terms of at least attempting to put out these kinds of standards. And I think that might be, for example, an area of collaboration with the EU and the U.S. around some kind of a standard for how you do that. Because if you don’t have an international standard on that, for example, it’s going to be a mess, right? Because-

Kaiser: Yeah. It’s encouraging that there’s some convergence on this issue, at least so far.

Paul: Yeah. Absolutely. I think that standards area is something hopefully where there could be more international convergence because it’s so important going forward. And so the generative AI interim regs talked about that. And then pretty quickly, and we’re only about a month later, TC260 has come out with these standards. So I think it shows that there is a huge commitment in the system to pursuing and fleshing out these kinds of standards and regulations.
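
For readers who want a concrete picture of the implicit watermarking idea discussed above, here is a minimal, hypothetical Python sketch. It is not based on the TC260 draft or any vendor’s actual scheme; it just illustrates the general “green list” technique from the research literature, where a keyed hash biases generation toward a hidden subset of tokens that a detector can later count. The vocabulary, key, and bias value are all invented for illustration.

```python
import hashlib
import random

VOCAB = ["the", "model", "data", "cloud", "china", "chip", "train", "policy",
         "open", "safety", "market", "compute", "standard", "mark", "ai", "text"]
KEY = "secret-watermark-key"  # hypothetical key shared by generator and detector

def green_list(prev_token: str) -> set:
    """Derive a keyed, pseudo-random 'green' half of the vocabulary from the previous token."""
    seed = int(hashlib.sha256((KEY + prev_token).encode()).hexdigest(), 16)
    rng = random.Random(seed)
    return set(rng.sample(VOCAB, len(VOCAB) // 2))

def generate(n_tokens: int, bias: float = 0.9) -> list:
    """Toy 'generator': with probability `bias`, pick the next token from the green list."""
    out = ["the"]
    rng = random.Random(0)
    for _ in range(n_tokens):
        greens = green_list(out[-1])
        pool = list(greens) if rng.random() < bias else VOCAB
        out.append(rng.choice(pool))
    return out

def green_fraction(tokens: list) -> float:
    """Detector: fraction of tokens that fall in the green list implied by their predecessor."""
    hits = sum(1 for prev, tok in zip(tokens, tokens[1:]) if tok in green_list(prev))
    return hits / max(1, len(tokens) - 1)

if __name__ == "__main__":
    watermarked = generate(200)
    rng = random.Random(1)
    unmarked = [rng.choice(VOCAB) for _ in range(200)]
    # Watermarked text should score well above 0.5; unmarked text should hover near 0.5.
    print("watermarked green fraction:", round(green_fraction(watermarked), 2))
    print("unmarked green fraction:  ", round(green_fraction(unmarked), 2))
```

An explicit watermark, by contrast, is simply a visible label or audible notice attached to the output, which is the other category the draft standards describe.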

Kaiser: So while we’re on the subject of the state, aside from wanting to or being willing to clear some of the regulatory obstacles, or at least refrain from placing too many obstacles in their path, what else is the state doing to advance these companies’ abilities to develop Generative AI and to conduct that kind of research in China? Are they getting any kind of proactive state support?

Paul: Yeah. I think that what you’re seeing the state do is signal, at many levels, that this is a priority industry, right? And so Xi Jinping, well, he makes comments on the importance of things like AI. And so over the last year in particular, and really arguably almost since the 2017 issuance of this national strategy, it’s clear that the government is very supportive of this industry and supports things like industry associations that are focused on AI. Different municipalities like Shanghai and others have all put out favorable guidance and regulations around elements of the AI technology stack. And so, I think there’s a general signal from the center that this is something, an area where Chinese companies should lead, right?

And again, there’s this consideration that there are some lower barriers to entry here except for the hardware part of the stack, which we need to talk about. But I think that generally, the regulations, and there’s a nod to this in the generative AI regulations themselves, are clearly trying to balance innovation and regulation here. And this is an area where, because there are so many capable Chinese companies in this space, there’s a real desire, specifically on Generative AI, to channel things into less sensitive areas like enterprise applications, manage the content piece through these regulations, and then provide things like compute capabilities. And so, for example, if you talk… I don’t know if you had a show on the East Data West Compute Project, which is a huge national effort to establish a series of data centers with advanced computing capabilities and make that available, for example, to companies to develop AI applications.

Kaiser: Yeah, Kendra Schaefer flicked that, yeah.

Paul: Those are the kinds of very supportive projects which are designed to sort of help fuel the industry. So, I think the government is trying to build a supportive regulatory framework around the industry and encourage companies to use these applications, and Chinese companies to be leaders in this space.

Kaiser: And investors are usually very sensitive to these signals. And they’re aware of when the government has opened up a new [inaudible 0:21:35], a new kind of wind tunnel. And so they set their sails by those. And is that happening in this case? Is there a lot of investor money going into Generative AI startups?

Paul: It’s a good question. I think it’s also a tricky period, of course, generally in the Chinese economy in terms of investments and sort of economic growth. I think, ironically, the sort of frenzy in AI and generative AI is happening at a time of sort of economic downturn. So, I think right now it’s tricky because big companies like Baidu, Alibaba, Tencent, and Huawei already have the resources to invest in these kinds of capabilities. And again, they’re very expensive to invest in. This depends on what you’re doing. But if you’re trying to develop your own large language model, you need a lot of computing, you need data, you need a lot of cloud infrastructure, and you need a lot of money to spend to actually train a model.

I mean, I think the estimate was $300 million to train GPT-4; I believe that is the figure. And so, we’re talking big bucks to actually play in the game of large language models. So, there’s already sort of a limitation in terms of the type of company that can do this. I think in China, there will be investments in companies with sort of niche pieces of the technology stack related to Generative AI. But I don’t think that there’s been this huge, massive influx into the space yet. I think in the U.S., you do see a lot more interest, for example, in second-tier companies that are beyond the sort of big players. But they’re all very narrowly focused on certain areas like image generation, or in the case of a company like Cohere AI, for example, they’re very narrowly focused on business applications and not large language models for general purposes. It’s a good question on the startups. I just haven’t seen what I think is a huge flood, again, because some of these things are going to take big, big investments over long periods of time to play in this space.

Kaiser: Real barriers to entry. Yeah, for sure. I mean, finally, I think we should get to U.S. policy and the export bans on advanced GPUs and other chips that are used in these large neural networks. I guess I want to start with maybe you could sort of lay out where we are right now in terms of regulation. I mean, I know that you’ve been keeping up on the other piece, which is investment, the outbound investment bans into certain key areas, including into Generative AI, I believe. So give us a précis of the state of things right now in terms of U.S. policy.

Paul: Sure. I think U.S. policy, with respect to Generative AI, is coming down in sort of two big areas. So, the first is an attempt to target the sort of advanced hardware at the base of the technology stack that generates things like large language models. So that’s the semiconductor base. Usually, on top of that, you’re running models in development environments like PyTorch and TensorFlow. And then, on top of that, you’re running specific training algorithms to train the large language models, and then finally, you have some end use. At the base of that are things like advanced GPUs, which turn out to be ideal for training large language models, with their sort of parallel processing. And so the U.S., last October, began this effort to sort of control the ability of Chinese companies to have access to those parts of the hardware stack, where U.S. companies frankly dominate, right?

And where there are not a lot of other alternatives. The line from the administration is that this is narrowly focused on national security. So, advances in AI are now considered very much in the national security domain. Now, it’s tricky because in this arena, unlike, for example, export controls on weapons of mass destruction, where it’s clear what the threat is, everybody understands that we want to control technologies that enable the development of nuclear weapons, AI is a little more nebulous here. What exactly is the sort of killer app, for example, that is going to change the military balance, whether it’s between U.S. and China or China and a Taiwan contingency, it’s really hard to define that. And so, in a sense, what the U.S. is trying to do is solve for the future potential of China, for example, or Russia or other countries to develop an advanced AI capability that somehow proves to be a game changer.

So, it’s a little bit novel to use export controls in that sense. Now, GPUs and other advanced computing capabilities, for example, can be used in high-performance computing to accelerate, for example, model development for weapon systems. And so, that’s a tangible use, but the vast majority of Generative AI is not used for that. So, Generative AI is sort of a different ball game. Generative AI, for the most part, the applications are civilian in nature, right? Drug discovery, better understanding of mapping data or environmental data. I mean, all sorts of applications that don’t necessarily, clearly jump out at you as military. So, it’s tricky in the sense that the U.S. is pursuing these controls on the hardware with the long-term goal of preventing the emergence of AI in China as some sort of military force.

And then the other piece of it is investment in those sectors, not just AI, but other things like semiconductors, of course, and quantum. And just today, before I got on this call, the U.S. government released this new executive order and a proposed rule that would control the investments by U.S. persons and U.S. companies, for example, venture capital and private equity, into companies in China that are in these critical technology areas. AI, interestingly, is a little more undeveloped in terms of how those controls are going to work than semiconductors, because there are existing export controls that the outbound investment rule will seek to align with, right? But AI is software. So, U.S. officials talk about not trying to control all software but very specific applications of AI, right? Like surveillance, facial recognition, and those kinds of areas where there’s a clear link to some activity that the U.S. considers of national security import. But again, they’re seeking feedback over the next three to six months around this rule to try to really keep it narrow and focused on military and sort of applications like surveillance that are considered to be important.

So, we’re sort of in a new area where AI is now part of the learning process of how the U.S. and other Western allies will control China’s access to this critical technology and try to do it in a narrow and targeted way. But that’s going to be difficult to keep narrow and targeted because of these other broader applications of AI that are mostly available.

Kaiser: The sort of hardware foundation of this, as you say, is the same whether we’re talking about deep reinforcement learning or we’re talking about Generative AI. I mean, it’s exactly the same gigantically networked bundles of GPUs, right? I mean, we’re talking trillions of connections at this point, right?

Paul: Right. We’re talking about 10,000 NVIDIA A100 GPUs, for example, that were used to train GPT-4. And these are very specialized systems that require, again, lots of hardware knowledge and then are very expensive to run. But those GPUs tend to be particularly suited to large language models. The thing that I’m still trying to get my head around is the U.S. government’s concern specifically about generative AI. Which, again, on the surface of it, there’s no clear, obvious sort of military end-use. I mean, you can envision some, certainly, in different applications, but it’s still sort of an R&D phase, if you will. And it’s still very much developmental and is used primarily in China.

If you look at all the different companies, they’re very focused on… Think of coal mining, right? Huawei’s Pangu model is being used to help optimize coal mining, right? A very mundane but important industry, but not a military application.

Kaiser: Yeah. It’s tough. I mean, I think the assumption is that it democratizes access to… Anyone can code, right? Anyone who knows how to write a query and knows how to write prompts. It just unleashes a lot of power into a lot of hands. I suppose that’s a concern.

Paul: Yeah. It can certainly accelerate the development of software code, right? Absolutely. Absolutely. But again, traditionally, the export control system and even the inbound investment system were really focused on this narrow set of technologies with a much clearer link to sort of the military end-use, right? Particularly for weapons of mass destruction. I mean, the materials were very specific to the nuclear industry, for example, or the sensors were specific to missile technology. Now we’re sort of several steps back in the food chain here, looking at semiconductors, which are themselves not the problem, but what they could be used for, depending on the application and the software, etc. There are a lot more links in the chain to get to the actual military end-use.

And that presents a challenge for other countries to sort of align with the U.S. on the idea that denying the most advanced GPUs, for example, to China is a way to sort of stave off a big national security problem. And a lot of countries aren’t aligned on that.

Kaiser: It seems the U.S. has already made up its mind, though. Anyway, it’s obviously something that is a gigantic concern to people in the Chinese AI research community. What are they doing by way of workarounds? Maybe we can first talk about the move that many of them have made now. I mean, I’m thinking specifically about Alibaba, Alibaba Cloud, and its take on Meta’s Llama system, basically. It’s called Tongyi Qianwen (通义千问), an open-source large language model. Is that part of the workaround strategies that they’re going to be doing?

Paul: It’s a great question. I think there are lots of different pieces to the workaround strategy. One is that, for example, Bytedance and others are trying to acquire large numbers of GPUs before the Commerce Department might change the requirements around the performance thresholds for, for example, the NVIDIA GPUs. So, Chinese companies can still buy A800s and H800s, which were the modified versions that NVIDIA put out after the October 7th controls. And so, Bytedance has put in a huge order for these things, right? So, one strategy is: let’s stock up as much as we can on the existing hardware. And then the second one will probably be acquiring these restricted semiconductors through lots of different channels, right? Getting around export controls using front companies, etc.

That is already probably going on because these are… In some cases, you can go online and buy these, right? The bigger systems that incorporate dozens or hundreds of processors are going to be much harder to acquire. But smaller numbers can be acquired. And then the other approach is to use alternative indigenously developed capabilities. For example, Huawei is using its Ascend processors that it developed on its own. Those were originally manufactured at TSMC before Huawei was restricted from using TSMC. They’re probably going to be using domestic players like SMIC to continue to manufacture those. But they advertise, for example, their AI stack for Generative AI as being trained on the Ascend processor. And they have enough of those, of the existing generation, to do a pretty good job of training their large language models.

And this is happening in the West too. Companies have their own processors; they’re optimized for their own hardware to train models, etc. Google’s doing this, of course, with products like TensorFlow. And Baidu has the same thing. Baidu has Kunlun, which is its processor. That’s one way around it: bundling maybe larger numbers of chips that are not as efficient as having access to the latest stuff, but you can still do a pretty good job training these large language models. And then the third way is, interestingly, as you noted, Baidu and Alibaba are allowing Llama, this Meta open-sourced model, to be accessed from their cloud services.

Kaiser: Oh, I see.

Paul: Now, this is interesting because I’m still wondering how the Chinese regulators will view this and whether that will be something that requires a license eventually. Because again, back to the regulation issue, CAC is going to come up with a licensing arrangement around these models so they have some visibility into what these models are doing.

Kaiser: So, it’s actually Llama. They’re accessing Llama through their cloud, not actually building a cloud-based…

Paul: Yes, this is an open-source model. It’s this model as a service. So, the big players, particularly Baidu and Alibaba, are offering large language models as a service. So, again, with access to the cloud and an API, you can access these models and then develop applications using your own proprietary data. These are private clouds or hybrid clouds. You can develop your own private application, using your own proprietary data, with one of these models. And so, in a sense, that’s a pretty important workaround, because those models were developed by Meta using their large computing capabilities. And so you can access those and train them on your own data. It’s an important ability. And right now, I should say, the U.S. government is probably thinking about how to control open-sourcing of large language models like that.

Because that’s sort of a little bit of a loophole, right? Where Chinese companies don’t have to rely only on their own domestically developed models. But the game now is for large cloud providers to provide access to these models, license access, help companies develop their own sort of enterprise model, and then charge for the service. That’s sort of how you make money off these things. Huawei, for example, has a set of large language models that are optimized for mining, as I mentioned earlier; for meteorology, called Pangu Weather; for railways, monitoring railway traffic; and for things like drugs, drug discovery. So, each of the big players in China specializes in developing models that are optimized for particular industry verticals.
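
To make the model-as-a-service pattern described above concrete, here is a minimal, hypothetical Python sketch of an enterprise client calling a cloud-hosted open-source model over an HTTP API and supplying its own proprietary context. The endpoint, API key handling, request fields, and response field are all invented for illustration and do not correspond to any specific provider’s actual API.

```python
import json
import os
import urllib.request

# Hypothetical endpoint and credentials for a cloud-hosted open-source model.
# Real providers each have their own URL scheme, authentication, and request format.
ENDPOINT = "https://example-cloud-provider.invalid/v1/llm/generate"
API_KEY = os.environ.get("LLM_API_KEY", "dummy-key")

def ask_hosted_model(question: str, proprietary_context: str) -> str:
    """Send a plain-language question plus private enterprise data to a hosted model."""
    prompt = (
        "You are an assistant for an energy company.\n"
        f"Internal data:\n{proprietary_context}\n\n"
        f"Question: {question}\nAnswer:"
    )
    payload = json.dumps({"model": "open-source-llm", "prompt": prompt,
                          "max_tokens": 256}).encode("utf-8")
    req = urllib.request.Request(
        ENDPOINT,
        data=payload,
        headers={"Authorization": f"Bearer {API_KEY}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:  # network call to the provider
        body = json.loads(resp.read().decode("utf-8"))
    return body.get("text", "")  # response field name is an assumption

if __name__ == "__main__":
    context = "Data center A: 4.2 MWh/day. Data center B: 6.8 MWh/day, cooling at 40%."
    print(ask_hosted_model("Which data center should we optimize first, and why?", context))
```

The point of the pattern is that the expensive part, training the base model on thousands of GPUs, happens on the provider’s side; the enterprise only supplies prompts and its own data, and pays for the service.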

Kaiser: Yeah. And this is something that you had anticipated. I remember you had written something in Digital China right after the draft regs came out, and you had said that you anticipated that that would be the direction that Chinese companies would take. That they would focus on industry verticals rather than doing these big general-purpose agents like ChatGPT. So give me some examples. I mean, you’ve just mentioned a few that are very interesting: mining, meteorology. What are some of the applications of Generative AI to specific databases? I mean, what would it look like? I mean, like you mentioned, meteorology, how would that play? I mean, what does it do in practice?

Paul: Well, in the case of, for example, Huawei’s Pangu Weather, they claim that it outperforms sort of standard models in terms of accuracy for weather forecasting, right? So, yeah, it sort of speeds up the ability to do forecasting, predicting the trajectory of typhoons. I wonder if it predicted the trajectory of the recent typhoons that hit China in a big way. These models are now trained and can plow through data much faster. And so, for example, what took ten days, now, using these models, they can get down to four or five hours. So, it’s sort of optimizing the ability to crunch through a lot of data, with specific models that are optimized for a particular data set.

Baidu, for example, is another interesting case, where they have Ernie, the sort of big model, which is sort of an analog of ChatGPT. And then they also have what they call the information distribution big model, the transportation big model, and the energy big model. They’re selling these now as a service. So, if you’re an energy company, you can come in, you might have a particular data center whose energy use you want to optimize, and you can use these large language models to crunch the proprietary data that you already have in that sector, and then develop solutions that optimize something like energy use in a data center. That’s the way that Chinese companies are thinking about doing this.

Kaiser: So, Paul, let me understand. This means, I suppose, part of it is just the ability to use natural language, just sort of ordinary language, to prompt or to query, and to pull kind of useful information out of these unwieldy, unstructured datasets, right?

Paul: Yes.

Kaiser: Okay.

Paul: Right. Let me give you another good example, and this is coming from a large U.S. company, but this is the kind of thing that Chinese companies are doing too. They’re assembling and training a version of GPT-4, for example, on huge datasets related to weather data and rainfall data and sort of mapping data, in a proprietary context. And then researchers, who may not be experts in this area, can go in and query using normal, fairly straightforward human queries, and the model, which is a multimodal model, will generate text and images related to that specific query. And normally, these kinds of databases are only accessible to specialist researchers. But here now, you’re sort of opening the aperture in terms of who can access these kinds of datasets, in a more sort of natural interrogation, using the front end of something like ChatGPT that people are now more familiar with, as you’ve seen.
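
As a rough illustration of what querying a proprietary dataset in plain language can look like under the hood, here is a small, hypothetical Python sketch. It simply selects the most relevant rows from a private table and packs them into a prompt for a model; the call_model function is a stand-in for whichever hosted or local LLM an enterprise actually uses, and the data, station names, and column names are invented.

```python
import csv
import io

# Invented proprietary data: daily rainfall readings from company-run stations.
RAINFALL_CSV = """station,date,rainfall_mm
Guangzhou-01,2023-08-01,12.5
Guangzhou-01,2023-08-02,88.0
Shenzhen-02,2023-08-01,5.1
Shenzhen-02,2023-08-02,61.3
"""

def relevant_rows(question: str, csv_text: str, limit: int = 20) -> list:
    """Very naive retrieval: keep rows whose station name appears in the question."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    hits = [r for r in rows if r["station"].split("-")[0].lower() in question.lower()]
    return (hits or rows)[:limit]

def build_prompt(question: str, rows: list) -> str:
    """Pack the retrieved rows and the plain-language question into one prompt."""
    table = "\n".join(f"{r['station']} {r['date']} {r['rainfall_mm']}mm" for r in rows)
    return f"Rainfall records:\n{table}\n\nQuestion: {question}\nAnswer briefly."

def call_model(prompt: str) -> str:
    """Stand-in for an actual LLM call (hosted API or local model)."""
    return "[model response would appear here]"

if __name__ == "__main__":
    q = "How much rain did the Guangzhou station get on August 2?"
    prompt = build_prompt(q, relevant_rows(q, RAINFALL_CSV))
    print(prompt)
    print(call_model(prompt))
```

The same pattern scales up: the retrieval step gets more sophisticated and the model gets larger, but the non-specialist still only writes the plain-language question.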

And you’ve seen companies now, there are whole companies coming up with optimized queries: here are the types of queries you can use, for example, to optimize your output. I used this just recently, actually. I wrote an email, and I asked ChatGPT, for example, to make it more deferential to the person I was writing the letter to, to make sure that I wasn’t too sort of passive-aggressive. And it did a great job of doing that.

Kaiser: Maybe use that on your tweets from now on, Paul.

Paul: Yeah, I’ll need to run my tweets through ChatGPT. Bard is actually my new favorite. Bard is very good because it’s up to date, whereas GPT-4’s training data ended in September 2021.

Kaiser: Right. And so, do Chinese companies, though, lose out? I mean, I see the appeal of using… I mean, for very many reasons, of focusing on discrete industry verticals, but don’t they lose out on something by not going after general-purpose tools? I mean, five years, ten years from now, are Chinese tech companies going to be at a huge disadvantage because they didn’t pursue general-purpose technologies?

Paul: It’s a good question, and I think it depends on how this evolves. I mean, in the West, in the sort of outside of China, the thinking is that at some point, these general models, they’ll sort of become your personal assistant, right? So, you’ll have your sort of individual model that’s sort of trained on your own data, understands you, and can be sort of optimized to respond to your needs. And again, the revenue model there is still not clear to me. I mean, I’m paying $20 a month for ChatGPT, and I use it all the time.

Kaiser: Yeah, me too.

Paul: But I think in the West, the general models will become important for sort of individual use and then also as a window into some of these more specialized areas, right? As people get used to interacting with these chat models, then if you need something that’s more detailed, you can call… they’ll be linked to other more narrow and detailed types of applications, like AlphaFold even. I mean, AlphaFold could be something where you’re doing research, and then all of a sudden, you might want to take a look at some scientific problem, and then you might use your sort of ChatGPT, your general interface, to interact with that. So yes, in China, I think that will come eventually, but only once the regulations are clear, right? I mean, again, I think that Chinese companies are being careful here not to get out in front of the regulators, because this is such a potentially explosive political alignment problem for those companies.

So I think once CAC figures out how to license these things, and then as companies get more adept at this reinforcement learning from human feedback, which is how they put guardrails on this (U.S. companies are doing this too), I’m sure that something like Baidu’s Ernie Bot, which is really probably the most capable challenger to something like ChatGPT and has done very well in benchmarks, by the way, in comparison to ChatGPT, will eventually see wider use, because there is a demand for that, right? When ChatGPT went up, a lot of people got U.S. phone numbers and VPNs and were accessing ChatGPT from China, even though OpenAI tried to restrict that with a geographic geofence. So, there’s a tremendous amount of interest in China in these kinds of capabilities, but because of the political alignment problem, the companies are trying to stick to the sort of non-sensitive applications and models.

And then, eventually, I think that there’ll be a more general development of general models that are more usable by the public once the regulatory system is a little bit more firm. So, they don’t want to run afoul of things before that happens. And they’ll work closely with the regulators to try to make sure that they don’t get ahead of the game.

Kaiser: So, for the benefit of my listeners here, I mean, I know that during the course of this conversation, we’ve name-checked a lot of companies and talked about what they’re doing, about their general sort of strategy and some of the novel things that they’ve been pursuing. Maybe we can put it all in one place and do it a little more systematically and just kind of go through company by company. Let’s go. I mean, here are the companies. Let’s do Alibaba, obviously, Tencent, Baidu, Huawei, and Bytedance; I think we’ve named it. Maybe that’s a good enough start. Why don’t we start with Alibaba? We’ve already talked a little bit about how they’re making Llama, Meta’s Llama, available through the Alibaba Cloud. What else are they doing? I mean, they bring a lot of assets to this obviously. They have all that payment data; they have all that purchase data. So, what are they doing with that?

Paul: Yeah. I think it’s important in general to note that each of the companies that have aspirations in this space brings advantages. So Alibaba, as you rightly point out, brings, for example, lots of logistics data. The verticals that they’ll be focused on will likely include things like logistics and also, of course, payments and payment systems. And then other areas: they’ve been working, for example, already with some industry verticals as part of their Ali Cloud offering. And they also have Ali Cloud, right? They have a sort of advantage in terms of being a cloud services provider, because that’s important, and some of the other players, like Bytedance, are not really cloud services providers. They have a very specialized thing. In general and globally, companies that are developing models are partnering, of course, with cloud service providers.

That’s why Microsoft and OpenAI, for example, are partnered; and Google and Inflection or Anthropic, I’m sorry, Anthropic. So, companies are sort of playing to their strengths. Alibaba, again, also has things like its own DAMO Academy. And they’re doing a lot with developing a sort of broader hardware and software stack for AI development. And so Generative AI sort of fits into that model. Baidu, as I noted, has its advantages. It has 20 years of search data in Chinese, right? And so, they’re very much focused on these… And that’s why Ernie Bot probably is so good, because it’s trained on 98% Chinese-language data. And they claim that it has the unique ability to tease out insights in terms of the Chinese language.

Kaiser: That’s always Baidu’s line. Yeah.

Paul: Huawei is a little bit different in the sense that they’re so focused on the enterprise. They say things like, we don’t write poems, right? That’s the little classic comment from Huawei. So they’re very much focused already on these industry verticals. And that’s in part, of course, because, since the controls on things like telecommunications inputs and semiconductors, they’ve been seeking out new markets. And so they’ve been looking at these industry verticals like automating port facilities and mining operations. And so, they’re very much focused on that area. But their cloud services also, again, are very robust. And so they’re offering this all through their cloud services. And they’re also trying to develop a sort of full development stack. So the hardware, the development environment. MindSpore, for example, is Huawei’s equivalent to PyTorch and TensorFlow, which are these really critical development environments.

And MindSpore is also compatible with PyTorch. So, they’re trying to figure out how to get developers, for example, in that space to develop applications using tools that they’re familiar with. And then Baidu has PaddlePaddle, for example, which is its sort of equivalent development environment. The game is sort of to have this full stack of development tools, from the hardware to the development environment to large language models, through this model-as-a-service that we talked about earlier. And so each of the companies is offering these kinds of services and then optimizing those models for those areas where it has some advantage in the data space. And so, again, Baidu has this transportation model and this big energy model, along with Ernie Bot. And then Huawei has these very specific areas: meteorology, mining, railway, and drugs. They’ve just expanded into government, finance, and manufacturing. And they’re offering specialized versions of, for example, Pangu in those sectors.
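
For readers unfamiliar with what a development environment layer like PyTorch, TensorFlow, MindSpore, or PaddlePaddle actually provides, here is a minimal PyTorch sketch of the define-a-model-and-train loop these frameworks all support. The tiny model and synthetic data are invented for illustration; MindSpore and PaddlePaddle expose broadly analogous, though not identical, APIs, which is what makes porting between stacks plausible.

```python
import torch
from torch import nn

# Synthetic data: 256 samples of 16 features, with a noisy linear target.
x = torch.randn(256, 16)
y = x @ torch.randn(16, 1) + 0.1 * torch.randn(256, 1)

# A tiny model: the framework supplies the layers, automatic differentiation, and optimizers.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

# A basic training loop: forward pass, loss, backward pass, parameter update.
for step in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    if step % 25 == 0:
        print(f"step {step}: loss {loss.item():.4f}")

# Training an actual large language model follows the same loop shape, just with
# transformer layers, billions of parameters, and thousands of GPUs underneath.
```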

And so that’s the trend you’re going to see. Bytedance, it’s not clear to me. Bytedance has a lot of data, obviously, from social media and from all the interactions, including a lot of videos. As for how they’re going to play in the Generative AI trend here, they already have, of course, very well-developed algorithms for recommending like-minded content. It’s not clear to me how they’re going to play in this in the same way as those other companies, which are…

Kaiser: Filters and things like that.

Paul: Yeah. I mean, I’m sure they’ll develop capabilities that further their business model using Generative AI. But these other, Huawei, Baidu, Tencent, and Alibaba, these are companies that are trying to figure out how to leverage their underlying datasets and their AI expertise in developing large language models to actually service clients in these industry verticals.

Kaiser: What about companies like Meituan? Are you aware of them doing much? And Alibaba competitors like Pinduoduo or JD.

Paul: Yeah. A lot of the big companies, Meituan is part of this, have also announced that they’re working on large language models. But the question for China, and similarly for the U.S., is how many individual models are actually needed? Because you have a fairly large number of… I think in China, it’s now 180 or something. There’s a large number if you look across all of the companies and some of the research institutes. Tsinghua, of course, has its own big large language model. And BAAI also has its WuDao, which was one of the first bigger models. If you look at them, though, they’re all sort of different. They have different training sets, datasets, and different abilities to be deployed.

Part of the game, too, is that you want to… These big large language models, they’re expensive to train, and then they’re also expensive to deploy in some cases too. So, there are smaller models, for example, that are being developed for sort of niche areas that aren’t as expensive to develop. Because this is not a game for people without some resources, both in terms of computing and sort of data center power, but also in terms of talent. So, there’s also sort of a lot of competition to recruit people who understand these models and how to deploy them in an enterprise context, for example.

Kaiser: Yeah. Let’s move on and talk about… We’ve talked a lot about hardware and software, but let’s talk about talent, as you say. What’s your sense of how China’s faring in that regard? Because I thought, a few years ago, it’s fair to say the consensus would’ve been that China was awash in kind of mid-level coders, but the really innovative AI scientists were pretty few and far between. But that seems to be changing too, at least to look at the names on the major peer-reviewed papers that are being published.

Paul: Yeah. That’s a tricky, tricky topic, sort of: how does China stack up on the talent side? You can look at things like the top 1% of cited papers, and that’s certainly an area where, if you look at that over the last five years, Chinese researchers, working for companies and for key state-owned labs, for example, have really moved up in specific areas, particularly for things like image recognition. One of the major papers that all subsequent research on image recognition references is a paper that’s in the top 1%, where all the researchers are from China, from different organizations. From that perspective, China has come a long way from, say, ten years ago; Chinese researchers are now really critical in these key areas, particularly for niches like image recognition and now natural language processing. And the quality of the education system, of course, in China is hugely improved.

And so when I talk to people in China, for example, who are recruiting software engineers to work on important AI projects, and at commercial companies, they’re blown away, for example, by the quality of the talent they’re seeing at second and third-tier universities in China. All the-

Kaiser: Second and third tier.

Paul: Yeah. So, like, everybody knows about Tsinghua and Beida. But we’re talking about Huazhong University, maybe, in Wuhan, and universities like Zhongshan University in Guangzhou. You know, China’s education system is very tiered, right? And everybody wants to get into the top five schools. But they’re recruiting at these second or third-tier universities and are finding that the software talent is every bit as good as at a top university. So, I think that speaks to the idea that over time, for example, China is going to have a big advantage in terms of personnel and sort of the talent pool that’s going to be available to help drive forward innovation in this sector, including in Generative AI.

In the U.S., of course, universities remain extremely desirable for foreign students to come and study at. And companies like Google, of course, and DeepMind, AWS, and Microsoft are also hiring large numbers of really capable people. I think that the challenge will be, as U.S.-China relations get tenser, to avoid some sort of bifurcation in the AI space. Because, as you know, there’s been so much collaboration in AI. And a lot of the leading Chinese AI scientists, for example, and company presidents came out of things like Microsoft Research Asia, where they were trained, or they went to university in the U.S., and then they went back to China. So, there’s been a tremendous amount of cross-fertilization between the sectors and the education systems on AI. This has benefited everybody. And so it’s one of the things where we want to try, I think, to avoid decoupling, for example, in that sector.

Kaiser: That’s right. And that’s actually what I wanted to ask you about to wrap this up. We are seeing this sort of movement toward a bifurcation, as unfortunate as it is. Maybe, Paul, you could spell out what some of the other costs of that are, not just to this wonderful period of collaboration and cross-pollination. But what dangers are posed by the kind of decoupled world of parallel AI platforms? What are we losing as we move toward that?

Paul: It’s a great question. Well, fundamentally, I think that the sort of freer flow of personnel back and forth between Chinese companies, U.S. companies, and researchers has really been of such huge benefit to the whole sector that any sort of separation there is going to be tough, for example, for U.S. companies if they can’t recruit and retain leading software engineers from China, right? And so, yeah, U.S. immigration law, for example, is so in need of reform: H-1B, all these areas that have been sort of neglected. And then, of course, we have this sort of more hostile environment in the U.S. caused by things like the China Initiative. And so over time, the U.S. could become a more hostile environment for AI researchers. We saw during the pandemic, for example, people unwilling to travel to the U.S.; and even after the pandemic, some of the leading AI conferences held in the U.S. have had much less attendance from China.

And if you remember this, four years ago, one of the leading conferences, NeurIPS, was scheduled during the Chinese New Year, and they moved it because the attendance of Chinese researchers was considered so important. It’s hard to imagine that happening now, unfortunately, right? Because now there’s a reluctance on both sides, I think, to sort of pursue some of the collaboration that’s happened in the past. That’s one thing. And then the other thing is that all this is happening within a huge debate about how to regulate AI in the EU, in the U.S., and of course, in China, as we’ve talked about. And so there is a general sense that there needs to be, ideally, some international, global cooperation on this around things like standards for watermarking content, right?

I mean, if we get into a world where we have a very different regulatory environment in China versus the rest of the world, and Chinese companies are leading the development in some of these areas, it’s going to be really suboptimal in terms of attempting to regulate AI and the spread of AI more globally, into the global south and other places, other applications, right? So, I think there’s a huge need to include Chinese companies and Chinese regulatory authorities in the global debate about how to regulate this industry. Because other than the big players in the U.S. and Canada, and some other smaller players, China is by far the biggest player in this space globally.

Kaiser: Absolutely.

Paul: If you’re going to leave out half the world on AI, that’s bad. From a U.S. national security perspective, for example, you don’t want to lose insight into what’s happening in China. And if China becomes such a black box on AI, that would be really bad. And then the surprises that people are worried about would become more likely because the communities wouldn’t be exchanging things. Right now, it’s hard to see a surprise in AI that, for example, is a huge military benefit, because everybody is sort of… There’s a lot of interaction, and people know where things are developed, and nobody’s going off in secret, isolated from the broader community, trying to develop some AI capability that’s nefarious. Right now, it’s sort of out in the open. People are publishing. But as people stop publishing because of these sensitivities, for example, and people stop making things available, open source for example, then you’re going to get into a very scary world where nobody knows what the other side is doing, and then that can lead to bad outcomes.

Kaiser: It’s really an argument against walling off from a national security perspective. And we do need to take the national security concerns very seriously. Thank you so much, Paul. I mean, it’s great that you could join me again and take time out of what was a very, very busy day for you. Let’s move on to recommendations. First, a quick couple of reminders. First, don’t forget that our next China conference is in New York on November 2nd. It’s an amazing event space. It’s going to be a delight. You can get tickets now. We’ve got amazing speakers lined up, just deep and very thin panels. Yasheng Huang is going to be keynoting; it’s going to be amazing. There are going to be some deep-dive breakout sessions on very, very important topics, including a couple of technology-related topics. We’re even going to do a game show, which is going to be just splendid.

I’m going to be hosting that at the end of the day. I am also going to be taping a Sinica Live along with Jeremy on the evening of November 1st in New York. So, if you get your VIP tickets for that, you can attend that as well. Sign up for that and make sure to do that. That’ll be fun. I’ll look forward to seeing a lot of you in New York. And if you can’t come to our conference but you still want to support the work that we do, as always, please take a moment and become an Access member of The China Project. You get our Daily Dispatch, this excellent newsletter. You get early access to Sinica most weeks (sorry, I’ve been lapsing a little bit recently because I’ve been on holiday), and much more. It’s all for the cost of what? Like three or four cups of coffee a month. All right, onto recommendations. Paul, what do you have for us, man?

Paul: Okay, well, I got a couple of things. One is sort of along the lines we’ve been talking about. I think The Alignment Problem, which is by Brian Christian, talks about machine learning and human values and talks about the sort of broader issue of how to align the development of AI with human development. I think it’s a very thoughtful treatment of the topic. And really is-

Kaiser: It’s a book or?

Paul: It’s pretty amazing. I also recommend a trilogy of books, actually, by Larry McMurtry, the Lonesome Dove trilogy.

Kaiser: I love it. I love it.

Paul: I guess it’s actually a trilogy, and in particular, Comanche Moon, which is, I think, the second book, which is an amazing look at the sort of American Southwest in the last century, or, I guess, in the 19th century, and just gives you an amazing-

Kaiser: I absolutely love Lonesome Dove. I haven’t read the others in the trilogy.

Paul: Yeah, they made a film of it, but I think the books are really rich in terms of… along with Cormac McCarthy, who is the other one I recommend: Blood Meridian, of course, which is sort of along a similar line here.

Kaiser: My favorite novel.

Paul: Another novel; it’s amazing.

Kaiser: My favorite novel of all time. Yeah.

Paul: I think I’ve just been, lately, interested in that period of U.S. history, which I think is very important.

Kaiser: Oh, that’s amazing. Yeah. There are some good Westerns that have shown up on streaming services recently. I saw one called Hostiles, which is pretty good.

Paul: Yeah, I think the Western genre has sort of returned to some level of interest, but I think there’s been new research too that’s gone on about the period and how the U.S. dealt with things like the large Indian populations in the West and how they interacted, which I think is really quite interesting. And so, I think that’s an area-

Kaiser: Excellent.

Paul: So, lots of tech. I could recommend a lot of other tech books, but I won’t do that since we talked so much about AI. Yeah.

Kaiser: Well, anytime you want, you can come back and recommend some more. Excellent. My recommendation for the week is just the third and final season, actually the whole show, but the third season of the HBO show The Righteous Gemstones, which just concluded. It stars the inimitable John Goodman as a megachurch minister with three truly awful adult children. So, it’s sort of like a more farcical, over-the-top Succession, because that’s also the issue, sort of, who succeeds him at the head of the church. It’s very, very funny. You might think that maybe Bible-thumping charlatans are too easy a target for really effective satire, but this is really funny. It’ll keep surprising you. It’s actually quite inventive. Walton Goggins stars in a bunch of episodes in the last two seasons of it. Now, he’s always… anything he touches is just great. So yeah, a show for you.

And the other one I would recommend is on Hulu; they have the sort of reboot of one of my favorite shows of old, Justified. It’s called… It’s based on a character created by Elmore Leonard, taken from some of his short stories, Fire in the Hole, and this one is called City Primeval. So this stars Timothy Olyphant, who’s just amazing. He’s sly and cool and charming. Raylan Givens, the character that he plays, is just such a winning character. So yeah, check those shows out: The Righteous Gemstones and Justified: City Primeval.

Paul: Cool.

Kaiser: All right, Paul. Great to have you on again, man.

Paul: My pleasure, Kaiser. I always enjoy our discussions. I commend the podcast for doing such a good job. I’m telling you it’s worth it.

Kaiser: Thanks so much, man.

The Sinica Podcast is powered by The China Project and is a proud part of the Sinica Network. Our show is produced and edited by me, Kaiser Kuo. We would be delighted if you would drop us an email at sinica@thechinaproject.com or just give us a rating and a review on Apple Podcasts, as this really does help people discover the show. Meanwhile, follow us on Xitter, as it’s now called, or on Facebook at @thechinaproj, and be sure to check out all of the shows on the Sinica Network. Thanks for listening, and we’ll see you next week. Take care.