Codex's 5 Million User Giveaway Slammed as "Stunt"! Claude Code Devours Nearly 90% of Tokens, OpenAI Loses Users Over Being "Stingy"?

Author | Chu Xingjuan

“Play around, joke around, but don’t mess with my tokens～”

“All paid ChatGPT subscribers’ Codex usage limits have been reset. Your weekly and hourly limits should both be back to 100%. Go make something awesome with your tokens today and have fun,” OpenAI Codex lead Tibo posted.

Reportedly, this move celebrates Codex surpassing 5 million users. This means ChatGPT paid users who recently hit their Codex usage caps can regain full limits and continue using Codex. Some users confirmed their weekly limit had been restored from about 60% to 99%, indicating the reset has taken effect on some accounts. Previously, ChatGPT Plus, Pro, and other users had to purchase additional credits after hitting limits.

However, user reactions to OpenAI’s reset “benefit” have been mixed.

Some users welcomed the reset: “I was really trying hard to use up my tokens, haha.” Another said, “Your reset post gave me the push to finally try fast mode. I hadn’t used it before, worried about burning through my quota too fast. But after this reset, I easily used it for real eval and review-agent work. Now, fresh tokens have given me an overdose of dopamine.”

But more people felt: “In reality, it didn’t bring any real benefit to most people—just a superficial reset.”

For many users, the regular weekly reset had only just happened and they had barely used any tokens, so the extra reset offered little practical gain.

Some users asked if the current rate was 5x, and whether the $100 subscription tier no longer had double usage. Replies said that, based on Pacific Time, double limits would continue until the end of the day.

Another user said they actually lost out in this supposed “benefit.” “My weekly limit was supposed to reset on the 6th, and I hadn’t used any tokens except opening the app to check the reset date. Now my weekly reset date has been pushed to the 7th. If a user’s limit was still at 100%, this blanket reset shouldn’t have pushed their reset date back,” said Jimmni.

In response, a netizen explained, “In a window less than 7 days, you can’t truly be at 100% limit. You need to send at least one prompt/message to start the weekly metering cycle, which drops your limit below 100% (though rounding may still show 100% on the interface).”

Jimmni replied that his experience was that just opening the app would set the 5-hour limit to 99% and weekly limit to 100%, enough to “lock in” the reset day without sending a prompt. He concluded, “For me, I lost a day. This reset was entirely negative for me.”
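The “lost a day” complaint can be illustrated with a minimal sketch. It assumes, as the commenters describe, that a weekly window opens on first activity and ends seven days later, and that a blanket reset restarts the window from the moment of the reset. The timestamps and the `window_end` helper are hypothetical, and this is not a description of how OpenAI actually meters usage.

```python
from datetime import datetime, timedelta

WEEK = timedelta(days=7)


def window_end(window_start: datetime) -> datetime:
    """Return the end of a weekly window that opened at window_start."""
    return window_start + WEEK


# Hypothetical timestamps, chosen only to mirror the complaint above.
first_activity = datetime(2026, 5, 30, 9, 0)  # user opens the app, window starts
blanket_reset = datetime(2026, 5, 31, 9, 0)   # assumed: reset restarts the window

before = window_end(first_activity)  # next reset on the 6th
after = window_end(blanket_reset)    # next reset pushed to the 7th
delay = after - before

print(before.date(), after.date(), delay)
```

Under these assumptions, a user who had consumed nothing gains no tokens from the reset but waits one day longer for the next one.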

A blanket reset can ease pressure on high-frequency users, but its timing, the duration of the double limits, differences between tiers, and the way the weekly window is calculated all affect users differently, so no single reset satisfies everyone.

1 Claude Code Consumes 8x More Tokens Than Codex

“Use Claude before Americans wake up; switch to Codex when US users come online and Claude slows down.” Anthropic has been using users to adjust tools and pricing. Codex hit 5 million users, but Claude Code remains developers’ first choice.

AI cost management platform CostHawk released an “Anonymous AI Tool Leaderboard,” using token consumption to measure developer usage intensity of AI coding tools. The leaderboard covers Claude Code, OpenAI Codex, Cursor, and others, showing anonymous aliases with only token counts, model names, timestamps, and hashed project IDs—no real names, accounts, emails, or prompts/responses/source code.

The page shows that CostHawk tracks 100 operators with a combined consumption of 415.9 billion tokens. The top user, LunarCircuit, used about 52.5 billion tokens. The top 1% (with 100 operators, that single account) contributed 12.6% of total tokens, showing how strongly high-intensity AI coding users drive overall usage.

By tool share, Claude Code dominates: it consumed about 369.7 billion tokens (88.9%), Codex 46.2 billion (11.1%), and Cursor about 770,000 tokens (near 0%). Among the 100 operators, 96 used Claude Code, 43 used Codex, and 2 used Cursor.
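The shares above, and the roughly 8x ratio in this section’s heading, can be reproduced from the reported totals. The sketch below is a plain arithmetic check on the figures quoted from the leaderboard. The variable and function names are illustrative and not part of any CostHawk API.

```python
# Token totals as reported on the leaderboard page, in billions of tokens.
tokens_by_tool = {
    "Claude Code": 369.7,
    "Codex": 46.2,
    "Cursor": 0.00077,  # about 770,000 tokens
}

total = sum(tokens_by_tool.values())


def share_pct(tool: str) -> float:
    """Return a tool's share of all tracked tokens, as a percentage."""
    return 100 * tokens_by_tool[tool] / total


# How many times more tokens Claude Code consumed than Codex overall.
claude_to_codex = tokens_by_tool["Claude Code"] / tokens_by_tool["Codex"]

for tool in tokens_by_tool:
    print(f"{tool}: {share_pct(tool):.1f}%")
print(f"Claude Code / Codex: {claude_to_codex:.1f}x")
```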

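Notably, the different designs of Codex and Claude Code lead to significant differences in token consumption for the same task. In a standard Figma integration task with identical prompts, codebase, and target output, Codex consumed about 72,000 tokens while Claude Code consumed about 235,000, roughly 3.3 times as many. The roughly 8x gap in this section’s heading comes from the leaderboard totals above (369.7 billion versus 46.2 billion tokens), not from this per-task comparison.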

The leaderboard further distinguishes user types: 56% use only Claude Code, 40% use multiple tools, 3% use only Codex, and 1% use only Cursor. This means among high-intensity AI coding users, Claude Code remains the main entry, but a significant portion already uses multiple tools.
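These percentages can be cross-checked against the per-tool operator counts reported earlier. The sketch below assumes that, with exactly 100 operators, each percentage point corresponds to one operator. The dictionary names are illustrative.

```python
operators = 100

# Operators per tool; one operator can appear under several tools.
used = {"Claude Code": 96, "Codex": 43, "Cursor": 2}

# Operators who use exactly one tool, plus the multi-tool group.
only = {"Claude Code": 56, "Codex": 3, "Cursor": 1}
multi_tool = 40

# The four usage patterns should account for every operator.
assert sum(only.values()) + multi_tool == operators

# Operators who use each tool alongside at least one other tool.
multi_by_tool = {tool: used[tool] - only[tool] for tool in used}
print(multi_by_tool)  # {'Claude Code': 40, 'Codex': 40, 'Cursor': 1}
```

If the reported figures are exact, all 40 multi-tool operators use both Claude Code and Codex, and one of them also uses Cursor.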

In growth terms, Cursor is growing about 1.3 times as fast as Codex this month: Cursor grew 100%, Codex 79.1%, and Claude Code 0.1%. However, Cursor’s small base means its high growth rate is an early expansion signal that has not yet changed the overall token share.

“Switching from Claude Code to OpenAI Codex” isn’t a new topic.

A month ago, a Reddit developer asked about the experience migrating from Claude Code to Codex. The poster said they had used Claude Code Max x20 for months, combined with Serena, MCP, GSD1, etc., across multiple projects, but often hit session limits.

Under the post, some developers have already made Codex their primary tool. A user with 30+ years of experience said Codex “gets the job done” and writes more maintainable code; Claude sometimes requires repeated corrections, especially on frontend tasks. However, he cautioned that AI tools don’t take long-term responsibility for codebases—developers remain responsible.

Another engineer with 20 years of experience at FAANG said they switch between Claude Code and Codex, especially when limits are tight. With the latest version, Codex is their first choice because of its faster feedback and better reasoning, but they still prefer to keep both and have them “adversarially” review each other during complex design and planning. Some users explicitly plan to cancel their Claude subscriptions and rely entirely on Codex.

However, many users believe Claude Code still has advantages in planning mode and multi-agent workflows. One developer’s workflow: use Claude Opus for planning, Codex for execution, then back to Claude for cleanup and optimization. They think GPT 5.5 still misses many things, and Claude remains irreplaceable for understanding intent and overall design.

Limits and pricing are key drivers of migration. A developer who previously used Claude Code almost exclusively said the main reason for switching was usage limits. Even on the $100/month plan, they could exhaust their weekly tokens within days. They think Claude is better at understanding goals, but Codex is better at strictly following instructions. They also noted that the Codex app experience isn’t ideal, so they mainly use the Codex CLI, and that plugins and skills built for Claude can’t be migrated without rewriting.

The user also mentioned that OpenAI’s core chat interface doesn’t count toward the same 5-hour window, so they can use the web for initial planning or small coding tasks; Claude’s web chat consumes the same subscription window, making Claude more restrictive in heavy development. Commenters agreed this is one reason Claude struggles to compete with OpenAI.

Meanwhile, some users cautioned that frequent limit exhaustion may not be tool-related but due to poor workflow management. One developer noted that if you’re using Claude Code across 3-5 projects, you should first improve session discipline rather than simply switching tools. Long histories, aimless codebase browsing, and excessive plugin stacking burn many tokens. Even with Codex, if you don’t control project sessions and task slicing, you might just “buy the same quagmire from a different vendor.”

2 AI Coding: 10x Price Gap Between Individuals and Teams

With the rapid development of AI coding tools, their business models are converging.

Both Claude Code and Codex adopt a low-barrier entry with separate charges for heavy usage, forming nearly identical subscription tiers: individual entry at about $20/month, and a premium tier for high-frequency professional developers at $200/month.

Based on public discussions and product info, 80%-90% of users are well below entry-tier limits, while the top 5%-10% of heavy users contribute most inference load. For vendors, continuing with overly loose fixed pricing would mean light users subsidizing heavy users long-term, leading to runaway inference costs. Thus, the $200 tier separates high-consumption developers, making truly frequent, professional, AI-dependent users pay for higher compute, while avoiding ordinary users paying for resources they don’t use.

Rate limits further reinforce this stratification. AI coding tools typically set usage caps within time windows, resetting every few hours. When developers frequently hit limits during projects, they tend to upgrade to higher tiers rather than switch tools. For users who have embedded Claude Code or Codex into daily workflows, habits and workflows themselves raise switching costs.

Inference cost is another key reason for this pricing structure. Frontier models are expensive to run, especially with complex reasoning, tool calls, and code execution. Heavy users’ actual compute consumption may far exceed subscription prices. Some analysis suggests that the usage provided by Claude Code Max at $200/month, if priced at pay-as-you-go API token rates, could cost over $1,000.

For vendors, a $200 fixed subscription brings stable high-value user revenue and hedges against inference load fluctuations. OpenAI’s ChatGPT Pro subscription was reported to generate significant annualized revenue growth within months; Anthropic’s Max tier is also seen as a direct response to heavy developer demand and cost structures.

This pricing is clearly attractive to AI companies. The $20 tier lowers the trial barrier, expands the user base, and collects usage data; the $200 tier captures professional users who derive higher business value from the tool and are more willing to have it reimbursed by their company. Compared to unpredictable per-token billing, subscription revenue also helps vendors plan GPU clusters, inference resources, and R&D budgets.

This trend isn’t limited to Anthropic and OpenAI. Cursor, Replit, and other AI IDEs and coding platforms are showing similar tiered pricing signs. The underlying logic is the same: AI coding tool usage varies greatly, with heavy users’ inference costs far exceeding ordinary users, forcing vendors to adopt tiered pricing for sustainable business models.

However, details may differ: Codex is trying to turn AI coding into a measurable, auditable token economy, while Claude Code emphasizes locking developers into daily use through a unified Claude workspace.

From a product positioning perspective, Codex is shifting from a “ChatGPT subscription add-on” to a “subscription limit + tokenized billing” model. OpenAI first integrated Codex into ChatGPT Plus, Pro, Business, Enterprise, and Edu plans, leveraging ChatGPT’s existing user base for reach; then monetized heavy developer usage through Codex credits and token rates.

In early April, OpenAI changed Codex pricing from “average deduction per message/PR” to aligning with API token usage: charging credits per million input tokens, cached input tokens, and output tokens. Different models have different rates, with output tokens typically much more expensive. This means developers’ long contexts, multi-round fixes, long outputs, and code reviews are broken down into finer-grained token costs.
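The token-aligned billing described above can be sketched as a simple cost function. The rates below are hypothetical placeholders, not OpenAI’s published credit rates, and `CreditRates` and `credits_used` are illustrative names. The sketch only shows how input, cached input, and output tokens each contribute to the total.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CreditRates:
    """Credits charged per one million tokens (placeholder values only)."""
    input_per_m: float
    cached_input_per_m: float
    output_per_m: float


def credits_used(rates: CreditRates, input_tokens: int,
                 cached_input_tokens: int, output_tokens: int) -> float:
    """Sum the credits charged for each token category of one task."""
    return (
        input_tokens * rates.input_per_m
        + cached_input_tokens * rates.cached_input_per_m
        + output_tokens * rates.output_per_m
    ) / 1_000_000


# Hypothetical rates: output priced well above input, cached input well below.
example_rates = CreditRates(input_per_m=10.0, cached_input_per_m=1.0,
                            output_per_m=80.0)

# Hypothetical long-context fix: mostly cached input and a short output.
task_credits = credits_used(example_rates, input_tokens=200_000,
                            cached_input_tokens=1_800_000,
                            output_tokens=50_000)
print(task_credits)  # 7.8
```

With these made-up rates, the 50,000 output tokens cost more credits than the 2 million input tokens combined, which is why long outputs and multi-round fixes weigh heavily under this kind of scheme.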

More noteworthy is OpenAI’s promotional strategy. OpenAI currently offers 2x Codex usage on the $100 Pro tier until May 31, 2026, temporarily raising its standard 5x Plus allowance to 10x. The $200 Pro tier, normally 20x Plus, temporarily has its 5-hour Codex limit held at 25x Plus. This design uses the $100 tier to directly counter Claude Max 5x, while the $200 tier retains truly heavy users, reducing the likelihood that limit anxiety pushes them to Claude Code.

Anthropic’s Consumption Less Transparent, More “Enterprise-Customized”

In contrast, Anthropic unifies Claude, Claude Code, Claude Desktop, and other entry points under the same subscription budget. Claude Code’s strategy is more about making Claude an all-day developer workspace rather than selling a standalone coding product.

This makes Claude Code’s business value come not only from the coding tool itself but also from locking developers into daily workflows. Once users make Claude Code their primary tool, overall subscription stickiness for Claude in chat, documents, code, and analysis increases.

However, compared to Codex, which breaks down local messages, cloud tasks, code reviews, model windows, and credits more clearly, Claude Code’s limit consumption is harder for users to precisely judge.

Recently, Claude Mythos’ pricing of $25 per million input tokens and $125 per million output tokens sparked community discussion.

“At this price, the cost of a single deep reasoning session with Mythos could equal a whole month of using Claude Sonnet. This price point will fundamentally change the economics of any startup relying on long-context reasoning,” said one netizen.
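To give a sense of scale, the quoted per-token prices can be turned into a per-session cost. The prices are the ones cited above. The session size is a hypothetical assumption chosen for illustration, and `session_cost_usd` is an illustrative helper rather than any vendor API.

```python
INPUT_USD_PER_M = 25.0    # USD per million input tokens, as quoted above
OUTPUT_USD_PER_M = 125.0  # USD per million output tokens, as quoted above


def session_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the pay-as-you-go cost of one session in US dollars."""
    return (
        input_tokens * INPUT_USD_PER_M
        + output_tokens * OUTPUT_USD_PER_M
    ) / 1_000_000


# Hypothetical long-context session: 2M input tokens, 200K output tokens.
cost = session_cost_usd(input_tokens=2_000_000, output_tokens=200_000)
print(f"${cost:.2f}")  # $75.00
```

Under that assumed session size, a single session costs several times the roughly $20/month entry subscription discussed earlier.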

Many netizens believe this high-price strategy is further widening the gap between individual and enterprise users, showing Anthropic is targeting high-end capabilities more at infrastructure layers and production environments than individual developers.

Some users called it “a new tier for rich tech enthusiasts,” joking that a “$1,000 plan” might appear. Comments like “not for the poor” reflect the community’s perception of high-end model pricing. One developer joked that for lightweight code modifications at 2 a.m., entry-level models suffice; “they can charge whatever they want.” Another said the price is indeed high, but they still use budget models for nighttime coding, saving money for weekend trips, while wondering if high-end models truly deliver value matching the price.

However, some users find the price not entirely unacceptable. One netizen noted that some previous high-end models had higher API call prices, so the current price isn’t extreme in the high-end reasoning model market. Others said that if they already have the highest subscription tier, they might afford such model usage costs.

“If output prices reach $125 per million tokens, this pricing architecture targets not individual users but infrastructure-layer clients,” a developer pointed out.

This discussion again highlights a pattern in model pricing: low-cost lightweight models may continue to serve daily use and personal development, while high-priced frontier models serve high-value workflows, enterprise production environments, and infrastructure-layer calls. The subscription tiers described above may be just the beginning; more options may appear, and developers may become increasingly confused by the various billing schemes.

3 Diverging Paths of AI Coding Tools

Beyond pricing strategies, AI coding tools are also diverging in their approaches.

Developer “Theo - t3․gg” believes Claude Code emphasizes experience and emotion, Codex focuses on efficiency and verification, and Cursor bets on cloud workflows. These three products represent three different paths; the real difference isn’t “which is smarter” but how each team understands “how software will be built in the future.”

Claude Code’s biggest feature is entering from the terminal, not requiring developers to switch IDEs, install new apps, or migrate to cloud environments. Its advantage is “standing where developers already are,” integrating directly into existing workflows via CLI.

This path has quickly gained developer acceptance. Cursor once held a strong mindshare in AI coding tools, but Claude Code has now taken that position. Among some entrepreneurs and developers, many who heavily used Cursor have clearly shifted to Claude Code.

However, Theo - t3․gg also points out that the other side of Claude Code is its strong “experience design” and “marketing attributes.” He believes Claude Code is not just a development tool but also a window for Anthropic to showcase “building AI applications with Anthropic models.” Its sub-agents, pet mode, terminal animations, token counting, and loading states all reinforce a sense of “lots of things happening,” making it highly shareable on X/Twitter.

In his view, Claude Code’s underlying philosophy can be summarized as: if more tokens can solve the problem, use more tokens. For example, using sub-agents to check projects in parallel, using many agents to audit code, and having the model execute more operations in the terminal. This often makes users feel “very productive” but may also bring higher token consumption and cost pressure.

Compared to Claude Code, Codex has a completely different product temperament. Codex’s interface is more restrained; tasks run without many animations, counters, or multi-agent displays—just a simple working status, timer, and task output. Theo - t3․gg says Codex “doesn’t try to be as addictive as a slot machine” but focuses on getting things done.

He repeatedly mentions that OpenAI’s Codex cares more about real engineering problems than social media virality. For example, Codex supports continuing to use the computer while the Mac is locked, new diff marker settings, and sending the current app screen to Codex as context via shortcuts. These features aren’t screenshot-friendly but genuinely improve engineering efficiency.

Theo - t3․gg particularly emphasizes Codex’s computer use capability. As model capabilities improve, Codex can actually view the results after modifying code, verifying whether the modification succeeded, rather than relying solely on the model to “imagine” if the code is correct. He believes this represents OpenAI’s core idea: not using more tokens for repeated checks, but using better environments and verification methods to achieve more reliable results with fewer tokens.

For Cursor, Theo - t3․gg believes its true strength is underestimated. Cursor once held the top mindshare among AI coding tools, but with Claude Code’s rise, many see Cursor as “falling to third place.” He thinks this is because many still view Cursor only as an IDE and overlook what Cursor Cloud can do.

In Theo - t3․gg’s view, Cursor’s cloud agent is the closest to the future among the three. Cursor Cloud doesn’t just provide a simple headless Linux sandbox; it can launch a full graphical Linux environment, run real applications, and test modifications via computer use.

This enables Cursor to handle more team-level and enterprise-level tasks. For example, if someone raises a product issue in Slack, team members can directly @Cursor bot to launch an agent to fix the issue and return a video proof of the fix in the same thread. He believes this “initiate task from collaboration tool, return verifiable result” workflow is currently hard for Claude Code and Codex to achieve.

Thus, Theo - t3․gg positions the three as bets on different time scales: Codex bets on the present, solving how to make today’s agents write code more reliably; Claude Code bets on model capabilities a few months out, believing models will become smart enough not to always need to run code; Cursor bets on a more distant future where developers no longer run agents primarily on local machines but trigger tasks via Slack, browsers, and cloud environments.

Additionally, the three companies differ significantly in openness and ecosystem strategy. OpenAI is more willing to provide underlying capabilities that others can build on. For example, Codex CLI’s app server provides a foundation for third-party agentic coding applications, allowing developers to build their own tools. In contrast, Anthropic prefers that users stay within Claude Code’s own UI and CLI experience, embedding integration depth into Claude Code rather than letting external tools call it programmatically. Cursor intends to open up its SDK and agent capabilities, but that work is not yet a priority and remains immature.

In product selection, Theo - t3․gg suggests that if a developer hates writing code, lacks motivation, or wants the coding process to be more fun and fulfilling, Claude Code is a good choice. Through its terminal, multi-agent, animations, and strong feedback mechanisms, it keeps users feeling “I’m making efficient progress.”

For experienced engineers skeptical of AI tools who want minimal disruption and reliable task completion when needed, Codex is more suitable. He believes Codex is more “built by engineers, for engineers,” emphasizing stability, verification, and integration into existing workflows.

Reference links:

https://x.com/thsottiaux/status/2061106703446450392

https://www.reddit.com/r/codex/comments/1tsydiy/reset_just_happened/?utm_source=chatgpt.com

https://costhawk.ai/leaderboard

https://techforward.io/why-the-20-to-200-pricing-leap-in-claude-code-and-codex/?utm_source=chatgpt.com

https://www.youtube.com/watch?v=JMYspR42HFM

https://www.youtube.com/watch?v=dcrASucavMk

Disclaimer: This article is an InfoQ original and does not represent the platform’s views. Reproduction without permission is prohibited.
