How Chinese Open Source Took the Lead in Global AI Usage
For two years, the best AI came out of San Francisco. Then developers followed the cost curve, and it led to China. Chinese open-source models now do more of the world's actual work than American ones, and the reason says everything about where the agent economy is heading.
For most of 2025, the story of AI was an American one. The best models, the biggest launches, the loudest valuations all came out of San Francisco. Then developers started doing the thing developers always do. They followed the cost curve. And the cost curve led to China.
In the week of February 9 to 15, 2026, Chinese-developed models passed American ones in token usage for the first time on OpenRouter, the largest platform for routing requests across hundreds of AI models. Chinese models handled 4.12 trillion tokens that week against 2.94 trillion for US models. A year earlier, American models had accounted for nearly 70 percent of the platform's top-ten usage and Chinese models for under 20 percent. The overtake was not a fluke. By the following week the Chinese volume had climbed to 5.16 trillion tokens while the US number fell. Months later, the lead has not just held, it has widened.
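To make the crossover week concrete, the short Python sketch below works out what the two quoted volumes imply. It uses only the 4.12 trillion and 2.94 trillion figures from the text. The share it prints is a share of those two groups combined, not of all traffic on the platform, and the variable names are illustrative.

```python
# Weekly token volumes quoted in the text, in trillions of tokens.
chinese_tokens = 4.12
us_tokens = 2.94

# Share of the two groups combined (NOT a share of the whole platform).
combined = chinese_tokens + us_tokens
chinese_share = chinese_tokens / combined
us_share = us_tokens / combined

# How many Chinese-model tokens were processed per US-model token.
lead_ratio = chinese_tokens / us_tokens

print(f"Chinese share of the two groups combined: {chinese_share:.1%}")
print(f"US share of the two groups combined:      {us_share:.1%}")
print(f"Chinese volume relative to US volume:     {lead_ratio:.2f}x")
```

On these numbers the Chinese group carried a little under three-fifths of the combined volume, roughly 1.4 tokens for every US-model token.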
It is worth being precise about what this measures, because the headline numbers get stretched. Token usage is not the same as users, revenue, or capability, and OpenRouter is one platform, not the whole market. The often-quoted "61 percent" figure refers to the Chinese share of usage among the ten most-used models in a single week, not across all 400-plus models on the platform. Longer-horizon data puts the Chinese open-weight share lower. But the direction is unambiguous, and token volume matters precisely because it reflects real deployment. It is what gets used when someone is paying by the request and watching the bill.
The simplest explanation is the right one. Chinese models are dramatically cheaper. DeepSeek's V3.2 charges around $0.42 per million output tokens. The comparable figure for a flagship American model can run into the tens of dollars, a gap of 10 to 20 times or more depending on which models you compare. For a casual chat, that gap is invisible. For anything running at scale, it is the whole decision.
What changed in 2026 is that scale arrived. The defining shift on OpenRouter was the move from chat to agents. Programming workloads went from roughly 11 percent of the platform's token volume in early 2025 to more than half. Agentic tasks, where a model is called over and over to plan, write, test, and revise, now make up the majority of output. A single overnight coding run can invoke a model thousands of times. When that is your usage pattern, per-token price stops being a line item and becomes the budget.
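A rough back-of-the-envelope sketch shows why. The Python below prices the output tokens of a single agent run at two per-token rates. Only the $0.42 figure comes from the text. The call count, the tokens per call, and the $10.00 premium price are assumptions chosen for illustration, and the function name is hypothetical.

```python
def run_cost_usd(calls: int, output_tokens_per_call: int,
                 usd_per_million_output: float) -> float:
    """Output-token cost of one agent run, in US dollars."""
    total_tokens = calls * output_tokens_per_call
    return total_tokens / 1_000_000 * usd_per_million_output


# Assumptions for illustration only; these are not measured figures.
CALLS_PER_RUN = 5_000            # "thousands of times" in one overnight run
OUTPUT_TOKENS_PER_CALL = 2_000   # hypothetical average response length

CHEAP_PRICE = 0.42     # USD per million output tokens (DeepSeek V3.2, from the text)
PREMIUM_PRICE = 10.00  # hypothetical placeholder for a premium model, not a quoted price

cheap_run = run_cost_usd(CALLS_PER_RUN, OUTPUT_TOKENS_PER_CALL, CHEAP_PRICE)
premium_run = run_cost_usd(CALLS_PER_RUN, OUTPUT_TOKENS_PER_CALL, PREMIUM_PRICE)

print(f"Cheap model, one run:   ${cheap_run:,.2f}")
print(f"Premium model, one run: ${premium_run:,.2f}")
print(f"Price ratio:            {premium_run / cheap_run:.1f}x")
```

Under these assumptions one run produces 10 million output tokens, which costs about $4 at the cheaper rate and $100 at the placeholder premium rate. The ratio depends only on the two prices, but the absolute gap grows with every additional run.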
Chinese labs were positioned for exactly this moment. Models like MiniMax's M2.5 and Moonshot's Kimi K2.5 were built natively for agent workflows, and they post benchmark scores within a hair of the American leaders. On SWE-Bench Verified, a standard coding test, M2.5 scores around 80.2 percent against roughly 80.8 percent for Claude. Near parity, at a fraction of the cost. For a developer pointing an autonomous agent at a problem and paying for every step, that is not a close call.
The cost advantage itself traces back to a few structural things: cheaper energy, heavy state investment in power, and more efficient model architectures. Some of that efficiency was forced. US export controls on advanced chips pushed Chinese labs to squeeze more out of less compute, and the result is a generation of models engineered around doing more with fewer resources.
Step back from the leaderboard and the bigger pattern comes into view. The entities driving this usage are increasingly not people typing into chat windows. They are software agents, calling models autonomously, transacting at machine speed, running through tokens in volumes no human conversation would ever generate.
That shift has a consequence the token charts only hint at. If the agent economy is the future, its supply side, the raw intelligence that agents buy and act on, is tilting toward open-source models, and many of the cheapest, most-used of those are now Chinese. The machines coming online need answers, and they are increasingly sourcing them from open weights rather than premium proprietary APIs.
This is already producing a second layer of infrastructure. Protocols are being built on the bet that autonomous agents will need to buy verified answers from a marketplace of many models rather than commit to a single vendor. Telegraph, a messaging protocol on Base, is one example, designed to route agent requests to whichever model scores highest for a given task and settle payment automatically. Whether any specific project wins is beside the point. The fact that this layer is being built at all tells you where the market thinks demand is heading: toward a world where most of the parties buying intelligence are not human.
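The routing idea in that paragraph can be sketched in a few lines of Python. This is a generic illustration of "send the request to the model that scores highest for the task, then work out what is owed", not Telegraph's actual design or API. Every name, score, and price below is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ModelOffer:
    name: str                       # hypothetical model identifier
    task_scores: dict[str, float]   # hypothetical per-task quality scores in [0, 1]
    usd_per_million_output: float   # hypothetical price


def route(task: str, offers: list[ModelOffer]) -> ModelOffer:
    """Pick the offer with the highest score for the task; break ties on lower price."""
    candidates = [o for o in offers if task in o.task_scores]
    if not candidates:
        raise ValueError(f"no model advertises a score for task {task!r}")
    return max(candidates,
               key=lambda o: (o.task_scores[task], -o.usd_per_million_output))


def amount_owed_usd(offer: ModelOffer, output_tokens: int) -> float:
    """Payment due for a completed request, priced per output token."""
    return output_tokens / 1_000_000 * offer.usd_per_million_output


# Made-up marketplace entries, for illustration only.
offers = [
    ModelOffer("model-a", {"coding": 0.80, "summarize": 0.70}, 0.50),
    ModelOffer("model-b", {"coding": 0.80, "summarize": 0.90}, 5.00),
    ModelOffer("model-c", {"summarize": 0.60}, 0.10),
]

chosen = route("coding", offers)
print(chosen.name, amount_owed_usd(chosen, 200_000))
```

Here the two coding-capable models tie on score, so the cheaper one wins. A real marketplace would also need a way to verify the scores and the answers, which this sketch leaves out.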
None of this means China has won AI, and anyone selling that headline is overreaching. Usage is not the same as dominance, and the US still holds an edge on some frontier models, even if by several measures the capability gap has narrowed to the point of effectively closing rather than widening in America's favor. Cost leadership can also evaporate the moment the economics shift. And Chinese models carry their own complications, including legal questions about data being shared with a foreign government, which matter a great deal for anyone routing sensitive workloads through them.
There is also a quieter worry, and it comes from the people closest to the open-source movement rather than its critics. The builder who writes under the handle 0xSero, whose open-source manifesto reached a large audience earlier this year, has argued in a follow-up that the open-weight window may be cresting rather than opening. The Qwen team that pushed local inference so far forward has moved on from Alibaba, where open source is now a lower priority. Some labs that planned open releases reconsidered. Meta stopped putting out open weights altogether. And the next generation of inference hardware, sold in racks costing millions and back-ordered for years, threatens to hand the cost advantage back to whoever can afford to buy in at scale. His point is uncomfortable precisely because he wants open source to win: leading in usage today is not the same as being safe tomorrow.
Those are the real stakes underneath the token charts. The "build it and they will come" assumption that underwrote American AI for two years has cracked, because developers turned out to be loyal to one thing, which is what works at a price they can afford. Open-source models proved that capability is no longer something you can lock behind a premium API, because someone will replicate it, ship it cheaper, and let the world fork it. That has always been the open-source bet, and in 2026 it is showing up in the only metric that reflects what people actually run. The open question is whether the conditions that made it possible (cheap compute, cheap inference, labs willing to publish their weights) will hold long enough to matter.
The models doing the most work in the world right now are open, cheap, and increasingly Chinese. The interesting question is not whether that lead holds week to week. It is what gets built on top of a world where the cheapest intelligence is also the most used, where most of the things using it are no longer people, and whether the door stays open long enough for the rest of us to walk through.