Vibe Coding App

Zhipu AI GLM Pricing (2026): Plans, API Costs & Free Models

9 min read
#zhipu ai#glm pricing#z.ai#ai coding api#glm coding plan#budget ai coding
Zhipu AI GLM Pricing (2026): Plans, API Costs & Free Models

TL;DR

Everything you need to know about Zhipu AI GLM Coding Plan pricing in 2026.

  • Free models: GLM-4.7-Flash and GLM-4.5-Flash at zero cost
  • Paid plans: Lite ($10/mo), Pro ($30/mo), Max (~$80/mo), billed quarterly
  • API pricing: GLM-5 at $1/M input, GLM-4.7 at $0.60/M input, free Flash models
  • Best for: Developers comparing Z.ai Coding Plan costs before subscribing

If you're comparing Zhipu AI GLM pricing before committing to a subscription, this page has every number you need. We cover the Coding Plan tiers, per-token API costs for every GLM model, what you get for free, and how the math shakes out for real coding workflows.

Important context: Z.ai removed its $3/month promotional pricing on February 11, 2026. If you've seen that number floating around, it no longer applies. Current pricing starts at ~$10/month.

Are There Free GLM Models?

Yes. Zhipu offers two models at zero cost to all registered users:

  • GLM-4.7-Flash: Free, 203K context, good for simple completions and formatting tasks
  • GLM-4.5-Flash: Free, general-purpose lightweight model

These are genuinely free, not trial-limited. You sign up at z.ai, get an API key, and start using them in any compatible tool. For basic code completion, formatting, and quick lookups, the free Flash models are enough to get real work done.

The catch: Flash models are optimized for speed, not depth. Complex debugging, multi-file refactors, and agentic workflows need the premium models.


GLM Coding Plan Tiers

The Coding Plan is Z.ai's subscription service that bundles premium model access for coding tools. All plans are billed quarterly (not monthly), which is worth knowing before you sign up.

Plan Quarterly Price Monthly Equivalent Models Usage Level
Lite $30/quarter ($27 Q2) ~$10/mo GLM-5.1, GLM-5-Turbo, GLM-4.7, GLM-4.6, GLM-4.5-Air ~3x Claude Pro usage
Pro $90/quarter ($81 Q2) ~$30/mo Everything in Lite + GLM-5 ~5x Lite usage
Max $240/quarter ($216 Q2) ~$80/mo Everything in Pro ~4x Pro usage

All tiers include free MCP tools: Vision Analysis, Web Search, Web Reader, and Zread.

Lite (~$10/mo): The Entry Point

Lite gives you access to GLM-5.1, GLM-5-Turbo, GLM-4.7, GLM-4.6, and GLM-4.5-Air. Usage is roughly 3x what you'd get from a Claude Pro subscription, measured in equivalent prompt volume.

For solo developers running one or two coding sessions per day, Lite covers the basics. You get the strong models (GLM-4.7 scores 73.8% on SWE-bench) without paying for the flagship GLM-5.

Pro (~$30/mo): Unlocks GLM-5

Pro is where you get access to GLM-5, Zhipu's most capable model. Usage scales to roughly 5x the Lite tier. If you're running agentic workflows, complex multi-file refactors, or need the strongest reasoning available from Z.ai, Pro is the tier that makes sense.

At $30/month equivalent, Pro sits between Claude Pro ($20/mo) and Cursor Pro ($20/mo) in terms of cost, but you're getting API access you can route through any compatible tool rather than being locked to one IDE.

Max (~$80/mo): Volume and Speed

Max gives you 4x the Pro tier's usage with the same model access. This is built for teams, heavy agentic usage, or developers who hit Pro limits regularly. At $80/month equivalent, it competes with Claude Team ($30/seat/mo) on a per-developer basis but offers significantly more throughput.


Per-Token API Pricing (All Models)

If you're using Z.ai's API directly instead of (or alongside) the Coding Plan, here's what every model costs per million tokens:

Premium Models

Model Input (per 1M tokens) Output (per 1M tokens) Context
GLM-5.1 $1.40 $4.40 203K
GLM-5 $1.00 $3.20 203K
GLM-5-Turbo $1.20 $4.00 203K
GLM-4.5-X $2.20 $8.90 -

Mid-Tier Models

Model Input (per 1M tokens) Output (per 1M tokens) Context
GLM-4.7 $0.60 $2.20 205K
GLM-4.6 $0.60 $2.20 205K
GLM-4.5 $0.60 $2.20 131K
GLM-4.5-AirX $1.10 $4.50 -

Budget and Free Models

Model Input (per 1M tokens) Output (per 1M tokens) Context
GLM-4.5-Air $0.20 $1.10 131K
GLM-4.7-FlashX $0.07 $0.40 -
GLM-4-32B $0.10 $0.10 128K
GLM-4.7-Flash Free Free 203K
GLM-4.5-Flash Free Free -

Vision Models

Model Input (per 1M tokens) Output (per 1M tokens)
GLM-5V-Turbo $1.20 $4.00
GLM-4.6V $0.30 $0.90
GLM-4.6V-Flash Free Free
GLM-OCR $0.03 $0.03

How GLM Pricing Compares

Here's how Z.ai stacks up against the major alternatives for coding-focused API access:

Provider Model Input (per 1M) Output (per 1M) Best For
Z.ai GLM-5 $1.00 $3.20 Budget frontier-level coding
Z.ai GLM-4.7 $0.60 $2.20 Best value mid-tier
Z.ai GLM-4.7-Flash Free Free Zero-cost simple tasks
Anthropic Claude Sonnet 4.6 $3.00 $15.00 Balanced coding + reasoning
Anthropic Claude Opus 4.6 $15.00 $75.00 Top-tier reasoning
OpenAI GPT-4o $2.50 $10.00 General-purpose coding
Google Gemini 3.1 Pro $1.25 $5.00 Multimodal coding

The cost gap is real. GLM-5 costs roughly 3x less than Claude Sonnet for input tokens and 5x less for output tokens. GLM-4.7 is even cheaper. For high-volume coding workflows where you're burning through tokens, the savings add up fast.


What Changed: The Price Increase Explained

Zhipu's pricing has shifted meaningfully since early 2026:

  • Before Feb 11, 2026: Promotional first-purchase pricing made Lite as low as $3/month. This was widely reported and is the number you'll still see in many articles.
  • Feb 11, 2026: Z.ai removed all first-purchase discounts. Lite moved to $30/quarter (~$10/mo).
  • Feb 2026: GLM-5 launched with ~30% higher per-token pricing than GLM-4.7, which Zhipu attributed to increased compute costs.
  • Q2 2026: Quarterly discount brings Lite to $27/quarter, Pro to $81, Max to $216.

Even after the increase, GLM Coding Plan remains one of the cheapest ways to access frontier-level coding models through your existing tools.


Coding Plan vs. Direct API: Which Should You Pick?

Choose the Coding Plan if:

Stay Updated with Vibe Coding Insights

Every Friday: new tool reviews, price changes, and workflow tips; so you always know what shipped and what's worth trying.

No spam, ever
Unsubscribe anytime
  • You use tools like Claude Code, Cursor, or Cline daily
  • You want predictable monthly costs
  • You don't want to think about per-token math
  • You need the bundled MCP tools (Vision, Web Search, Zread)

Choose direct API access if:

  • You're building custom tooling or integrations
  • Your usage is bursty (heavy some weeks, idle others)
  • You want fine-grained control over which model handles which task
  • You're already managing API costs across multiple providers

For most developers using Z.ai as their primary coding model provider, the Coding Plan is simpler. The bundled pricing works out cheaper than equivalent API usage for consistent, daily coding workflows.


Compatible Tools

GLM Coding Plan works with any tool that supports OpenAI or Anthropic API formats. The confirmed list includes:

  • Claude Code (via Anthropic-compatible endpoint)
  • Cursor (via OpenAI-compatible endpoint)
  • Cline (direct integration)
  • Kilo Code
  • Continue.dev
  • OpenClaw
  • 20+ additional tools with API compatibility

Setup involves subscribing, getting your API key from z.ai, and configuring the endpoint in your tool of choice. Z.ai provides guides for the most popular integrations in their documentation.


Cost-Saving Tips

1. Use Free Flash Models for Simple Tasks

Route formatting, basic completions, and quick lookups through GLM-4.7-Flash (free) and save your Coding Plan quota for complex tasks. Most tools let you configure different models for different task types.

2. Match Model to Task Complexity

Don't send a one-line code formatting request to GLM-5. Use GLM-4.7 or even GLM-4.5-Air for routine work and reserve premium models for multi-file refactors, debugging, and agentic workflows.

3. Watch for Quarterly Discounts

Z.ai runs periodic quarterly discounts (like the Q2 2026 pricing). Subscribing during discount windows saves 10% on every tier.

4. Batch Context Instead of Multiple Small Requests

Sending one detailed prompt with full context is cheaper and produces better results than sending five small follow-ups. This applies whether you're on the Coding Plan or direct API.


Frequently Asked Questions

Is Zhipu AI GLM free?

Zhipu offers two free models: GLM-4.7-Flash and GLM-4.5-Flash, available to all registered users with no subscription required. The Coding Plan for premium models starts at ~$10/month (Lite tier, billed quarterly at $30).

How much does GLM Coding Plan cost?

Three tiers, all billed quarterly: Lite at $30/quarter ($10/mo), Pro at $90/quarter ($30/mo), and Max at $240/quarter (~$80/mo). Q2 2026 discounts bring these down to $27, $81, and $216 per quarter.

What happened to the $3/month plan?

That was a first-purchase promotional price available before February 11, 2026. Z.ai removed all first-purchase discounts on that date. The current Lite tier is ~$10/month billed quarterly.

What is the difference between GLM-5 and GLM-4.7?

GLM-5 is Zhipu's flagship model with stronger reasoning and coding performance. It costs $1.00/M input tokens vs. GLM-4.7's $0.60/M. GLM-5 is only available on Pro and Max Coding Plan tiers. For most coding tasks, GLM-4.7 (73.8% SWE-bench) delivers strong results at a lower cost.

Can I use GLM models in Cursor?

Yes. GLM Coding Plan supports OpenAI-compatible API formats. Configure the Z.ai API endpoint and your API key in Cursor's settings, and you can use any GLM model as your coding assistant.

Does Z.ai offer a free trial?

There's no formal trial for the Coding Plan, but the free Flash models let you test Z.ai's infrastructure and API workflow at zero cost before subscribing.

Is the Coding Plan worth it vs. Claude Pro?

If you primarily need a coding model and want to use it across multiple tools (not just one IDE), the Coding Plan offers significantly more usage per dollar. Lite (~$10/mo) gives roughly 3x the usage of Claude Pro ($20/mo). The trade-off is that GLM models may not match Claude's reasoning depth on the most complex tasks.

How does billing work?

All Coding Plan tiers are billed quarterly (every three months), not monthly. Your subscription auto-renews at the end of each quarter. You can cancel before renewal to avoid the next charge.


Last updated: April 2026. Pricing and features are subject to change. Check Z.ai's official pricing page for the most current information.

Zane

Written by

Zane

AI Tools Editor

AI editorial avatar for the Vibe Coding team. Reviews tools, tests builders, ships content.

Related Tools

Claude Code CLI

Claude Code CLI

Anthropic's official terminal-based agentic coding tool that deeply understands your local codebase, autonomously edits files, runs terminal commands, handles git workflows, and iterates via natural language prompts. Now with Channels (Telegram/Discord remote messaging), Remote Control, Dispatch, and /loop scheduling for persistent autonomous agents. Designed for professional developers who live in the terminal.

Requires Claude Pro/Max/Team
GitHub Copilot

GitHub Copilot

AI coding assistant integrated into GitHub and VS Code. Generates code, fixes bugs, merges PRs, and now supports agent workflows. The original mainstream AI code tool.

Pro
Gemini Code Assist

Gemini Code Assist

Google's AI coding assistant for supported IDEs and Google Cloud workflows.

Free and paid tiers
Windsurf (by Cognition)

Windsurf (by Cognition)

Windsurf (formerly Codeium, now by Cognition/Devin team) is an agentic IDE with Cascade for multi-step coding, proprietary SWE-1.5 model (13× faster than Sonnet 4.5), Fast Context for rapid codebase search, AI-powered Codemaps for visual code navigation, and plugins for 40+ IDEs.

Free / $15/mo and up
Blackbox AI

Blackbox AI

AI coding assistant with multi-model access (Claude, Codex, Gemini, and more), autonomous agents for end-to-end tasks, and IDE integrations across VS Code, JetBrains, and 35+ platforms.

Free tier + Pro from ~$8/mo
Tabnine

Tabnine

Enterprise-grade AI code assistant with inline completions, autonomous agents, and an organizational Context Engine. Deploys SaaS, VPC, on-prem, or fully air-gapped with zero code retention. Gartner 2025 Magic Quadrant Visionary for AI Code Assistants.

From $39/user/month (annual). Agentic Platform $59/user/month. Enterprise: custom quote. No permanent free tier (14-day trial available).

Mentioned in this comparison

Related Articles