Claude Sonnet 4 Gets Massive Upgrade With 1 Million Token Context Limit — But Only via API for Now

Anthropic just gave its Claude Sonnet 4 model a serious memory boost. The chatbot can now handle up to 1 million tokens of context — roughly five times its previous limit — but there’s a catch: it’s API-only for the moment.

API Users Get First Dibs on Long Context

The new upgrade is rolling out to customers on Anthropic’s API who have Tier 4 or custom rate limits. Wider availability is expected in the coming weeks, but for now, everyday Claude web and mobile users will have to wait.

Anthropic says long context is also live on Amazon Bedrock and will arrive soon on Google Cloud’s Vertex AI.
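For API users, opting into the larger window is a matter of flagging the request. Below is a minimal sketch of what such a request payload could look like; the model ID and the beta header value (`context-1m-2025-08-07`) are taken from Anthropic's rollout materials and should be treated as assumptions — check the current API docs before relying on them.

```python
# Sketch: assemble a Messages API request that opts into the 1M-token
# context beta. Nothing is sent here; this only builds the payload.

def build_long_context_request(prompt: str) -> dict:
    """Return a request dict with the assumed long-context beta header."""
    return {
        "model": "claude-sonnet-4-20250514",  # assumed model ID
        "max_tokens": 4096,
        "messages": [{"role": "user", "content": prompt}],
        "extra_headers": {
            # Beta flag for the 1M-token window (assumption; verify in docs)
            "anthropic-beta": "context-1m-2025-08-07",
        },
    }

req = build_long_context_request("Summarize this repository.")
print(req["extra_headers"]["anthropic-beta"])
```

In the official Python SDK, a dict like this maps onto `client.messages.create(...)` keyword arguments, with `extra_headers` carrying the beta flag.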

The jump from the previous 200,000-token limit means developers can keep far more information in a single conversation without hitting the memory ceiling.


What 1 Million Tokens Really Means

A million tokens is not just a bigger number on paper — it’s a huge leap in practical use. That’s about 75,000+ lines of code, or hundreds of documents, all remembered in a single session without Claude losing track.

Previously, users had to send details in smaller batches, constantly working around the limit. That also meant Claude could “forget” earlier details as new information pushed older context out.

Now, entire codebases with dependencies can be loaded in one go, letting Claude analyze, refactor, or debug without losing its place.
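Before loading a whole codebase, it is worth checking that it actually fits. A common rough heuristic is about four characters per token for English text and code; the sketch below uses it to estimate whether a set of files stays under the 1M budget. Real counts come from the provider's tokenizer, so treat this purely as a back-of-the-envelope check.

```python
# Rough feasibility check: will a codebase fit in a 1M-token window?
# Uses the coarse ~4-characters-per-token heuristic, not a real tokenizer.

CONTEXT_LIMIT = 1_000_000
CHARS_PER_TOKEN = 4  # heuristic average for English prose and code

def estimate_tokens(text: str) -> int:
    """Crude token estimate from character count."""
    return max(1, len(text) // CHARS_PER_TOKEN)

def fits_in_context(files: dict, reserve: int = 50_000) -> bool:
    """True if all files, plus a reserve for the model's reply, fit."""
    total = sum(estimate_tokens(src) for src in files.values())
    return total + reserve <= CONTEXT_LIMIT

# Tiny in-memory stand-in for a repository
repo = {"main.py": "print('hi')\n" * 200, "util.py": "x = 1\n" * 100}
print(fits_in_context(repo))  # → True
```

A real pipeline would walk the project directory and read files from disk, but the budget arithmetic is the same.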

Not Every Model Gets the Upgrade

The change applies only to Claude Sonnet 4. Anthropic's high-end Opus 4.1 still operates under the older limits — largely due to higher operating costs.

That distinction matters for teams deciding which Claude variant to use for large-scale projects.

Developers Eye New Possibilities

With 1M tokens, Anthropic points to some standout use cases:

  • Analyzing hundreds of PDFs or legal contracts at once

  • Running AI agents that keep context across hundreds of tool calls

  • Loading full software projects into a single chat without cutting them up

Pricing will increase for prompts over 200K tokens, but Anthropic says prompt caching can cut costs and speed things up.
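Prompt caching works by marking the large, stable part of a prompt (for example, a loaded codebase) as cacheable, so repeated requests reuse it instead of paying full price each time. The sketch below builds a messages list using the `cache_control` content-block format from Anthropic's caching documentation; treat the exact field names as assumptions and verify them against the current API reference.

```python
# Sketch: structure a prompt so its big, unchanging prefix can be cached.
# Only the payload is built here; no request is sent.

def build_cached_prompt(codebase: str, question: str) -> list:
    """Messages list whose large prefix carries a cache marker."""
    return [{
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": codebase,
                # Marks this block as a cacheable prefix (assumed format)
                "cache_control": {"type": "ephemeral"},
            },
            # The part that varies per request stays outside the cache marker
            {"type": "text", "text": question},
        ],
    }]

msgs = build_cached_prompt("<entire repo here>", "Where is the bug?")
print(msgs[0]["content"][0]["cache_control"]["type"])  # → ephemeral
```

Keeping the cacheable prefix byte-identical across requests is what makes the cache hit; only the trailing question should change.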

Web and Mobile Users Will Have to Wait

The company says the upgrade will reach Claude’s mobile and web apps “at some point in the future,” but there’s no firm timeline yet.

That leaves API-connected tools — and developers — as the first to test just how far this new memory stretch can go.

Hayden Patrick is a writer who specializes in entertainment and sports. He is passionate about movies, music, games, and sports, and shares his opinions and reviews on these topics. He also covers other subjects when needed, such as health, education, and business.
