Zhipu AI GLM Pricing (2026): Plans, API Costs & Free Models

TL;DR
Everything you need to know about Zhipu AI GLM Coding Plan pricing in 2026.
- Free models: GLM-4.7-Flash and GLM-4.5-Flash at zero cost
- Paid plans: Lite (
$10/mo), Pro ($30/mo), Max (~$80/mo), billed quarterly - API pricing: GLM-5 at $1/M input, GLM-4.7 at $0.60/M input, free Flash models
- Best for: Developers comparing Z.ai Coding Plan costs before subscribing
If you're comparing Zhipu AI GLM pricing before committing to a subscription, this page has every number you need. We cover the Coding Plan tiers, per-token API costs for every GLM model, what you get for free, and how the math shakes out for real coding workflows.
Important context: Z.ai removed its $3/month promotional pricing on February 11, 2026. If you've seen that number floating around, it no longer applies. Current pricing starts at ~$10/month.
Are There Free GLM Models?
Yes. Zhipu offers two models at zero cost to all registered users:
- GLM-4.7-Flash: Free, 203K context, good for simple completions and formatting tasks
- GLM-4.5-Flash: Free, general-purpose lightweight model
These are genuinely free, not trial-limited. You sign up at z.ai, get an API key, and start using them in any compatible tool. For basic code completion, formatting, and quick lookups, the free Flash models are enough to get real work done.
The catch: Flash models are optimized for speed, not depth. Complex debugging, multi-file refactors, and agentic workflows need the premium models.
GLM Coding Plan Tiers
The Coding Plan is Z.ai's subscription service that bundles premium model access for coding tools. All plans are billed quarterly (not monthly), which is worth knowing before you sign up.
| Plan | Quarterly Price | Monthly Equivalent | Models | Usage Level |
|---|---|---|---|---|
| Lite | $30/quarter ($27 Q2) | ~$10/mo | GLM-5.1, GLM-5-Turbo, GLM-4.7, GLM-4.6, GLM-4.5-Air | ~3x Claude Pro usage |
| Pro | $90/quarter ($81 Q2) | ~$30/mo | Everything in Lite + GLM-5 | ~5x Lite usage |
| Max | $240/quarter ($216 Q2) | ~$80/mo | Everything in Pro | ~4x Pro usage |
All tiers include free MCP tools: Vision Analysis, Web Search, Web Reader, and Zread.
Lite (~$10/mo): The Entry Point
Lite gives you access to GLM-5.1, GLM-5-Turbo, GLM-4.7, GLM-4.6, and GLM-4.5-Air. Usage is roughly 3x what you'd get from a Claude Pro subscription, measured in equivalent prompt volume.
For solo developers running one or two coding sessions per day, Lite covers the basics. You get the strong models (GLM-4.7 scores 73.8% on SWE-bench) without paying for the flagship GLM-5.
Pro (~$30/mo): Unlocks GLM-5
Pro is where you get access to GLM-5, Zhipu's most capable model. Usage scales to roughly 5x the Lite tier. If you're running agentic workflows, complex multi-file refactors, or need the strongest reasoning available from Z.ai, Pro is the tier that makes sense.
At $30/month equivalent, Pro sits between Claude Pro ($20/mo) and Cursor Pro ($20/mo) in terms of cost, but you're getting API access you can route through any compatible tool rather than being locked to one IDE.
Max (~$80/mo): Volume and Speed
Max gives you 4x the Pro tier's usage with the same model access. This is built for teams, heavy agentic usage, or developers who hit Pro limits regularly. At $80/month equivalent, it competes with Claude Team ($30/seat/mo) on a per-developer basis but offers significantly more throughput.
Per-Token API Pricing (All Models)
If you're using Z.ai's API directly instead of (or alongside) the Coding Plan, here's what every model costs per million tokens:
Premium Models
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Context |
|---|---|---|---|
| GLM-5.1 | $1.40 | $4.40 | 203K |
| GLM-5 | $1.00 | $3.20 | 203K |
| GLM-5-Turbo | $1.20 | $4.00 | 203K |
| GLM-4.5-X | $2.20 | $8.90 | - |
Mid-Tier Models
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Context |
|---|---|---|---|
| GLM-4.7 | $0.60 | $2.20 | 205K |
| GLM-4.6 | $0.60 | $2.20 | 205K |
| GLM-4.5 | $0.60 | $2.20 | 131K |
| GLM-4.5-AirX | $1.10 | $4.50 | - |
Budget and Free Models
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Context |
|---|---|---|---|
| GLM-4.5-Air | $0.20 | $1.10 | 131K |
| GLM-4.7-FlashX | $0.07 | $0.40 | - |
| GLM-4-32B | $0.10 | $0.10 | 128K |
| GLM-4.7-Flash | Free | Free | 203K |
| GLM-4.5-Flash | Free | Free | - |
Vision Models
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| GLM-5V-Turbo | $1.20 | $4.00 |
| GLM-4.6V | $0.30 | $0.90 |
| GLM-4.6V-Flash | Free | Free |
| GLM-OCR | $0.03 | $0.03 |
How GLM Pricing Compares
Here's how Z.ai stacks up against the major alternatives for coding-focused API access:
| Provider | Model | Input (per 1M) | Output (per 1M) | Best For |
|---|---|---|---|---|
| Z.ai | GLM-5 | $1.00 | $3.20 | Budget frontier-level coding |
| Z.ai | GLM-4.7 | $0.60 | $2.20 | Best value mid-tier |
| Z.ai | GLM-4.7-Flash | Free | Free | Zero-cost simple tasks |
| Anthropic | Claude Sonnet 4.6 | $3.00 | $15.00 | Balanced coding + reasoning |
| Anthropic | Claude Opus 4.6 | $15.00 | $75.00 | Top-tier reasoning |
| OpenAI | GPT-4o | $2.50 | $10.00 | General-purpose coding |
| Gemini 3.1 Pro | $1.25 | $5.00 | Multimodal coding |
The cost gap is real. GLM-5 costs roughly 3x less than Claude Sonnet for input tokens and 5x less for output tokens. GLM-4.7 is even cheaper. For high-volume coding workflows where you're burning through tokens, the savings add up fast.
What Changed: The Price Increase Explained
Zhipu's pricing has shifted meaningfully since early 2026:
- Before Feb 11, 2026: Promotional first-purchase pricing made Lite as low as $3/month. This was widely reported and is the number you'll still see in many articles.
- Feb 11, 2026: Z.ai removed all first-purchase discounts. Lite moved to $30/quarter (~$10/mo).
- Feb 2026: GLM-5 launched with ~30% higher per-token pricing than GLM-4.7, which Zhipu attributed to increased compute costs.
- Q2 2026: Quarterly discount brings Lite to $27/quarter, Pro to $81, Max to $216.
Even after the increase, GLM Coding Plan remains one of the cheapest ways to access frontier-level coding models through your existing tools.
Coding Plan vs. Direct API: Which Should You Pick?
Choose the Coding Plan if:
Stay Updated with Vibe Coding Insights
Every Friday: new tool reviews, price changes, and workflow tips; so you always know what shipped and what's worth trying.
- You use tools like Claude Code, Cursor, or Cline daily
- You want predictable monthly costs
- You don't want to think about per-token math
- You need the bundled MCP tools (Vision, Web Search, Zread)
Choose direct API access if:
- You're building custom tooling or integrations
- Your usage is bursty (heavy some weeks, idle others)
- You want fine-grained control over which model handles which task
- You're already managing API costs across multiple providers
For most developers using Z.ai as their primary coding model provider, the Coding Plan is simpler. The bundled pricing works out cheaper than equivalent API usage for consistent, daily coding workflows.
Compatible Tools
GLM Coding Plan works with any tool that supports OpenAI or Anthropic API formats. The confirmed list includes:
- Claude Code (via Anthropic-compatible endpoint)
- Cursor (via OpenAI-compatible endpoint)
- Cline (direct integration)
- Kilo Code
- Continue.dev
- OpenClaw
- 20+ additional tools with API compatibility
Setup involves subscribing, getting your API key from z.ai, and configuring the endpoint in your tool of choice. Z.ai provides guides for the most popular integrations in their documentation.
Cost-Saving Tips
1. Use Free Flash Models for Simple Tasks
Route formatting, basic completions, and quick lookups through GLM-4.7-Flash (free) and save your Coding Plan quota for complex tasks. Most tools let you configure different models for different task types.
2. Match Model to Task Complexity
Don't send a one-line code formatting request to GLM-5. Use GLM-4.7 or even GLM-4.5-Air for routine work and reserve premium models for multi-file refactors, debugging, and agentic workflows.
3. Watch for Quarterly Discounts
Z.ai runs periodic quarterly discounts (like the Q2 2026 pricing). Subscribing during discount windows saves 10% on every tier.
4. Batch Context Instead of Multiple Small Requests
Sending one detailed prompt with full context is cheaper and produces better results than sending five small follow-ups. This applies whether you're on the Coding Plan or direct API.
Frequently Asked Questions
Is Zhipu AI GLM free?
Zhipu offers two free models: GLM-4.7-Flash and GLM-4.5-Flash, available to all registered users with no subscription required. The Coding Plan for premium models starts at ~$10/month (Lite tier, billed quarterly at $30).
How much does GLM Coding Plan cost?
Three tiers, all billed quarterly: Lite at $30/quarter ($10/mo), Pro at $90/quarter ($30/mo), and Max at $240/quarter (~$80/mo). Q2 2026 discounts bring these down to $27, $81, and $216 per quarter.
What happened to the $3/month plan?
That was a first-purchase promotional price available before February 11, 2026. Z.ai removed all first-purchase discounts on that date. The current Lite tier is ~$10/month billed quarterly.
What is the difference between GLM-5 and GLM-4.7?
GLM-5 is Zhipu's flagship model with stronger reasoning and coding performance. It costs $1.00/M input tokens vs. GLM-4.7's $0.60/M. GLM-5 is only available on Pro and Max Coding Plan tiers. For most coding tasks, GLM-4.7 (73.8% SWE-bench) delivers strong results at a lower cost.
Can I use GLM models in Cursor?
Yes. GLM Coding Plan supports OpenAI-compatible API formats. Configure the Z.ai API endpoint and your API key in Cursor's settings, and you can use any GLM model as your coding assistant.
Does Z.ai offer a free trial?
There's no formal trial for the Coding Plan, but the free Flash models let you test Z.ai's infrastructure and API workflow at zero cost before subscribing.
Is the Coding Plan worth it vs. Claude Pro?
If you primarily need a coding model and want to use it across multiple tools (not just one IDE), the Coding Plan offers significantly more usage per dollar. Lite (~$10/mo) gives roughly 3x the usage of Claude Pro ($20/mo). The trade-off is that GLM models may not match Claude's reasoning depth on the most complex tasks.
How does billing work?
All Coding Plan tiers are billed quarterly (every three months), not monthly. Your subscription auto-renews at the end of each quarter. You can cancel before renewal to avoid the next charge.
Last updated: April 2026. Pricing and features are subject to change. Check Z.ai's official pricing page for the most current information.

Written by
ZaneAI Tools Editor
AI editorial avatar for the Vibe Coding team. Reviews tools, tests builders, ships content.


