Why Chinese AI Now Costs 30 Times Less Than American Models

Two of China's most capable AI labs cut their model prices this week to a fraction of what Western rivals charge, while OpenAI and Anthropic moved the other way.

Key Points:

  • DeepSeek made its 75% V4-Pro discount permanent on May 22, fixing output at $0.87 per million tokens.
  • Xiaomi cut MiMo-V2.5 prices by up to 99% on May 26, with cached Pro inputs as low as $0.0036 per million tokens.
  • OpenAI lifted GPT-5.5 output to $30 per million tokens, widening the gap with Chinese frontier models.

DeepSeek, Xiaomi Cut Rates

DeepSeek confirmed on May 22 that a temporary 75% discount on its V4-Pro model would become permanent, fixing output at $0.87 per million tokens and input at $0.435.

The promotion had been set to expire May 31.
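For readers who want to check the arithmetic, the list price implied by the permanent rates can be backed out as below. This is a minimal sketch that assumes the 75% discount applies uniformly to both input and output; the function name `list_price` is illustrative, and the resulting figures are derived here, not quoted from DeepSeek.

```python
def list_price(discounted_price: float, discount: float) -> float:
    """Return the pre-discount price implied by a discounted price."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    return discounted_price / (1 - discount)


# Figures from the article: 75% discount, USD per million tokens.
V4_PRO_DISCOUNT = 0.75
implied_output = list_price(0.87, V4_PRO_DISCOUNT)
implied_input = list_price(0.435, V4_PRO_DISCOUNT)

print(f"Implied pre-discount output price: ${implied_output:.2f}")
print(f"Implied pre-discount input price:  ${implied_input:.2f}")
```

Under that assumption, the pre-discount rates work out to roughly $3.48 for output and $1.74 for input per million tokens.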

Days later, fellow Chinese lab Xiaomi slashed MiMo-V2.5 rates by up to 99% for cached inputs, effective May 27, with the Pro tier's cache hits priced as low as $0.0036 per million tokens.

By contrast, GPT-5.5 from OpenAI doubled its predecessor's output rate to $30 per million tokens. Claude Opus 4.7 from Anthropic lists $5 for input and $25 for output per million tokens.
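A rough sense of the headline multiple can be had by dividing the quoted output prices. The sketch below uses only the per-million-token figures cited in this article and compares raw list prices; it makes no adjustment for model quality, and the helper name `price_ratio` is illustrative.

```python
# Output prices in USD per million tokens, as quoted in this article.
OUTPUT_PRICE = {
    "DeepSeek V4-Pro": 0.87,
    "GPT-5.5": 30.00,
    "Claude Opus 4.7": 25.00,
}


def price_ratio(expensive: str, cheap: str) -> float:
    """How many times more the first model charges for output than the second."""
    return OUTPUT_PRICE[expensive] / OUTPUT_PRICE[cheap]


for model in ("GPT-5.5", "Claude Opus 4.7"):
    ratio = price_ratio(model, "DeepSeek V4-Pro")
    print(f"{model} output costs {ratio:.1f}x DeepSeek V4-Pro")
```

On these list prices, GPT-5.5 output comes to about 34 times the DeepSeek rate and Claude Opus 4.7 to about 29 times.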

Engineers Defend the Math

Xiaomi also rebuilt its token plans. The $100 Max plan now grants 82 billion tokens, up from 1.6 billion, with the same money buying five to eight times more usage than before.

Fuli Luo, who leads Xiaomi's MiMo team and once co-built DeepSeek-V2, tied the cuts to a smarter way of storing and reusing data the model has already processed.

That approach trims computing demand sharply.

Luo argued the lab can run near full capacity at the new rates and still cover its costs, which suggests the pricing reflects real efficiency gains rather than a loss-leading promotion.

The savings matter most for production tasks that reuse the same context. Agent pipelines with stable prompts, document processors, and retrieval tools all hit cache constantly, so cheaper cached input cuts the running bill directly.
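How much cheaper cached input trims a bill depends on the share of tokens that hit the cache. The sketch below illustrates the calculation. The cached rate is the $0.0036 figure quoted above; the uncached rate of $0.36 and the monthly volume are hypothetical placeholders chosen for illustration, not published Xiaomi prices.

```python
def input_bill(tokens_millions: float, cache_hit_rate: float,
               uncached_price: float, cached_price: float) -> float:
    """Input cost in USD when a share of tokens is served from cache."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    cached = tokens_millions * cache_hit_rate
    uncached = tokens_millions - cached
    return uncached * uncached_price + cached * cached_price


CACHED_PRICE = 0.0036    # USD per million tokens, quoted in the article
UNCACHED_PRICE = 0.36    # USD per million tokens, HYPOTHETICAL placeholder
MONTHLY_TOKENS = 1_000   # millions of input tokens, HYPOTHETICAL workload

for hit_rate in (0.0, 0.5, 0.9):
    cost = input_bill(MONTHLY_TOKENS, hit_rate, UNCACHED_PRICE, CACHED_PRICE)
    print(f"cache hit rate {hit_rate:.0%}: ${cost:,.2f}")
```

With these placeholder numbers, a pipeline that reuses 90% of its context pays roughly a ninth of what the same workload costs with no cache hits, which is why stable-prompt workloads benefit most.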

Western labs face a different bind. OpenAI's pivot toward consumer features and advertising hints that token revenue alone may not carry its valuation.

Why the Gap Keeps Widening

DeepSeek and Xiaomi did not open this contest. Chinese models already undercut American rivals before either announcement landed.

MiniMax M2.7 runs at $0.30 input and $1.20 output per million tokens. Kimi K2.5 from Moonshot AI sits at $0.60 and $2.50.

Analysts tracking cost against benchmark performance peg the Q2 2026 price-to-quality gap between Chinese and American frontier models at roughly 15 to 30 times, before any cache discounts. This week's reductions widen that gap further for the repetitive workloads that dominate real deployments.

The pattern echoes early 2025, when DeepSeek's low-cost releases rattled markets and forced Western providers to defend their pricing. A year on, the pressure has only intensified, and the response from American labs has been to hold or raise rates rather than chase the floor.
