Z.ai Launches Open-Source GLM-4.5 AI Models

Chinese startup Z.ai introduced GLM-4.5 and its lightweight variant GLM-4.5-Air, Apache 2.0-licensed LLMs designed for advanced reasoning, coding, and agentic tasks. Benchmarks show they match or beat proprietary models like Claude 4 Sonnet and Gemini 2.5 Pro. Available via API, through self-hosting, or on HuggingFace, these models promise flexible, cost-effective enterprise deployment.

Published July 29, 2025 at 01:11 AM EDT in Artificial Intelligence (AI)

Z.ai Unveils GLM-4.5 LLM Family

Chinese startup Z.ai has launched its GLM-4.5 family of open-source large language models, including the flagship GLM-4.5 and the lightweight GLM-4.5-Air. Both are available under an Apache 2.0 license, enabling free commercial use, modification, and self-hosting. Z.ai claims performance on par with or exceeding leading proprietary systems.

Top-Tier Performance on Benchmarks

  • GLM-4.5 ranks third across 12 industry benchmarks—trailing only OpenAI’s GPT-4 and xAI’s Grok 4.
  • Matches or surpasses Claude 4 Sonnet and Gemini 2.5 Pro on BrowseComp, AIME24, and SWE-bench Verified tests.
  • GLM-4.5-Air holds a top-six spot while delivering faster inference and lower resource requirements.

Dual Modes and Key Features

  • Thinking mode for complex reasoning, tool integration, and multi-step workflows.
  • Non-thinking mode for rapid, single-turn responses when speed matters; a hedged request sketch for switching between the two follows this list.
  • Automates full PowerPoint slide decks from a single prompt—ideal for meetings and reports.
  • Supports code generation, creative copywriting, emotional messaging, and virtual character dialogues.
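
The exact request format comes from Z.ai's documentation rather than this article, but the mode switch is typically exposed as a flag on an OpenAI-compatible chat endpoint. The sketch below assumes that shape; the base URL, model id, and the "thinking" field name are placeholders to verify against Z.ai's API docs.

```python
# Hedged sketch: switching GLM-4.5 between thinking and non-thinking mode over
# an OpenAI-compatible chat endpoint. The base_url, model id, and the
# "thinking" extra field are assumptions, not confirmed by this article.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.z.ai/api/paas/v4/",  # assumed endpoint; check Z.ai docs
    api_key="YOUR_ZAI_API_KEY",
)

def ask(prompt: str, thinking: bool) -> str:
    """Send one chat turn, toggling the slower reasoning mode on or off."""
    response = client.chat.completions.create(
        model="glm-4.5",  # assumed model id
        messages=[{"role": "user", "content": prompt}],
        # Hypothetical field for the mode switch; the real name may differ.
        extra_body={"thinking": {"type": "enabled" if thinking else "disabled"}},
    )
    return response.choices[0].message.content

# Multi-step reasoning when quality matters, fast path when latency matters.
print(ask("Plan a three-step rollout of an internal coding agent.", thinking=True))
print(ask("What does Mixture-of-Experts mean?", thinking=False))
```

In practice the thinking mode trades latency for better multi-step reasoning, so long-horizon agent tasks go to it while quick lookups stay on the fast path.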

Apache 2.0 Licensing and Pricing

Both GLM-4.5 and GLM-4.5-Air are available under Apache 2.0, permitting free commercial use, modification, and self-hosting. Cloud API pricing starts at $0.60 per million input tokens and $0.20 per million output tokens, with volume discounts available. Western teams should review data sovereignty considerations, as the hosted API runs on infrastructure in China.
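
At the quoted starting rates, estimating a workload's monthly spend is simple arithmetic. The figures below use the article's numbers and ignore volume discounts, so treat them as illustrative only.

```python
# Illustrative arithmetic only: monthly API spend at the starting rates quoted
# above ($0.60 per million input tokens, $0.20 per million output tokens).
# Actual pricing varies by model variant and any volume discounts.
INPUT_RATE = 0.60   # USD per million input tokens
OUTPUT_RATE = 0.20  # USD per million output tokens

def monthly_cost(input_tokens_per_day: float, output_tokens_per_day: float, days: int = 30) -> float:
    """Estimate monthly spend in USD for a steady daily token volume."""
    daily = (input_tokens_per_day / 1e6) * INPUT_RATE + (output_tokens_per_day / 1e6) * OUTPUT_RATE
    return daily * days

# Example: 20M input + 5M output tokens per day comes to about $390 per month.
print(f"${monthly_cost(20e6, 5e6):,.2f}")
```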

Architecture and Efficiency

GLM-4.5 employs a 355 billion-parameter Mixture-of-Experts design in which roughly 32 billion parameters are active per token, combined with Grouped-Query Attention and Multi-Token Prediction for speculative decoding. GLM-4.5-Air scales down to 106 billion total parameters, with about 12 billion active. Both models were trained on trillions of tokens and fine-tuned for agentic capabilities via slime, Z.ai's in-house RL framework.
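
To make the "active parameters" idea concrete, here is a generic top-k Mixture-of-Experts layer in PyTorch. It is a toy for intuition only, not Z.ai's implementation: a small router picks two of eight expert MLPs per token, so most of the layer's parameters sit idle on any given forward pass, just as GLM-4.5 activates only a fraction of its 355 billion parameters.

```python
# Toy top-k MoE layer: a router activates only a few expert MLPs per token,
# so the active parameter count is far below the total. Generic illustration,
# not GLM-4.5's actual architecture or hyperparameters.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMoELayer(nn.Module):
    def __init__(self, d_model: int = 64, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # scores each expert per token
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: (tokens, d_model)
        scores = self.router(x)                            # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)     # keep only the top-k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):                     # run each token through its chosen experts
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * self.experts[e](x[mask])
        return out

tokens = torch.randn(10, 64)
print(ToyMoELayer()(tokens).shape)  # torch.Size([10, 64]); only 2 of 8 experts ran per token
```

The same routing idea is what lets a 355B-parameter model serve tokens at roughly the compute cost of a much smaller dense model.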

Enterprise Impact

For AI engineers and orchestration teams, the GLM-4.5 family delivers a high-performing, cost-effective alternative to closed LLMs. Its open-source license grants full access to the model weights and flexible deployment across cloud, on-prem, or hybrid environments, while supporting streaming APIs, structured outputs, and tool-calling for mission-critical workflows.
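
Tool-calling is the piece that matters most for orchestration teams. The sketch below shows a standard OpenAI-style tools round trip against a self-hosted endpoint; the URL, model id, and get_order_status function are hypothetical placeholders, and the exact tool-call format GLM-4.5 emits should be confirmed against Z.ai's documentation.

```python
# Hedged sketch of an OpenAI-style tool-calling round trip against a
# self-hosted GLM-4.5 endpoint. The URL, model id, and get_order_status tool
# are hypothetical placeholders for illustration.
import json
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # assumed local server

tools = [{
    "type": "function",
    "function": {
        "name": "get_order_status",  # hypothetical enterprise tool
        "description": "Look up the shipping status of an order by its id.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}]

response = client.chat.completions.create(
    model="glm-4.5",  # assumed id of the deployed weights
    messages=[{"role": "user", "content": "Where is order A-1042?"}],
    tools=tools,
)

message = response.choices[0].message
if message.tool_calls:  # the model decided the tool is needed
    call = message.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))
    # Execute the real function here, append its result as a "tool" message,
    # and call the API again so the model can produce the final answer.
```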

Integration and Demos

Developers can download the model weights and code from HuggingFace and ModelScope or integrate via Z.ai's API, with vLLM and SGLang supported for self-hosted serving. Interactive demos, including Flappy Bird clones, Pokédex web apps, and dynamic slide generators, showcase rapid prototyping, content creation, and agent-driven experiences.
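
A common self-hosting path is vLLM's OpenAI-compatible server. The snippet below sketches that flow with streaming enabled; the zai-org/GLM-4.5-Air repo id, the tensor-parallel setting, and the local port are assumptions to check against the model card and vLLM docs.

```python
# Hedged sketch: querying a self-hosted GLM-4.5-Air behind vLLM's
# OpenAI-compatible server. Repo id, parallelism, and port are assumptions.
#
# Launch the server first (shell):
#   vllm serve zai-org/GLM-4.5-Air --tensor-parallel-size 4
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # vLLM's default local port

stream = client.chat.completions.create(
    model="zai-org/GLM-4.5-Air",  # assumed Hugging Face repo id
    messages=[{"role": "user", "content": "Generate a single-file Flappy Bird clone in HTML and JS."}],
    stream=True,  # streaming keeps long code generations responsive
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```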

The Growing Chinese Open Source Wave

GLM-4.5 joins a wave of recent Chinese open-source releases—such as Alibaba’s Qwen3 and Wan 2.2—that challenge U.S. proprietary models. With permissive Apache 2.0 licenses and competitive benchmarks, these efforts broaden choices for enterprises and drive global innovation in foundational AI infrastructure.

Explore how QuarkyByte’s deep-dive analytics can help your team evaluate GLM-4.5 in sandbox environments and integrate open-source LLMs into existing workflows. Partner with QuarkyByte to benchmark performance, optimize deployment costs, and maintain compliance across global infrastructures.