Z.ai Launches Open-Source GLM-4.5 AI Models
Chinese startup Z.ai introduced GLM-4.5 and its lightweight variant GLM-4.5-Air, Apache 2.0-licensed LLMs designed for advanced reasoning, coding, and agentic tasks. Benchmarks show they match or beat proprietary models such as Claude 4 Sonnet and Gemini 2.5 Pro. Available via API, self-hosting, or on Hugging Face, these models promise flexible, cost-effective enterprise deployment.
Z.ai Unveils GLM-4.5 LLM Family
Chinese startup Z.ai has launched its GLM-4.5 family of open-source large language models, including the flagship GLM-4.5 and the lightweight GLM-4.5-Air. Both are available under an Apache 2.0 license, enabling free commercial use, modification, and self-hosting. Z.ai claims performance on par with or exceeding leading proprietary systems.
Top-Tier Performance on Benchmarks
- GLM-4.5 ranks third across 12 industry benchmarks—trailing only OpenAI’s GPT-4 and xAI’s Grok 4.
- Matches or surpasses Claude 4 Sonnet and Gemini 2.5 Pro on BrowseComp, AIME24, and SWE-bench Verified tests.
- GLM-4.5-Air holds a top-six spot while delivering faster inference and lower resource requirements.
Dual Modes and Key Features
- Thinking mode for complex reasoning, tool integration, and multi-step workflows.
- Non-thinking mode for rapid, single-turn responses when speed matters.
- Automates full PowerPoint slide decks from a single prompt—ideal for meetings and reports.
- Supports code generation, creative copywriting, emotional messaging, and virtual character dialogues.
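The dual-mode switch above can be sketched as a request-payload helper. This is a minimal illustration, not Z.ai's documented API: the `thinking` field name and its enabled/disabled values are assumptions, so verify them against the official API reference before relying on them.

```python
# Sketch of toggling GLM-4.5's thinking mode in an OpenAI-style
# chat-completions payload. The "thinking" field and its values are
# assumptions for illustration, not confirmed by this article.

def build_chat_payload(prompt: str, think: bool) -> dict:
    """Build a request body for a GLM-4.5 chat completion."""
    return {
        "model": "glm-4.5",
        "messages": [{"role": "user", "content": prompt}],
        # Hypothetical switch: deep multi-step reasoning vs. fast single-turn.
        "thinking": {"type": "enabled" if think else "disabled"},
    }

fast = build_chat_payload("Summarize this log line.", think=False)
deep = build_chat_payload("Plan a multi-step refactor.", think=True)
print(fast["thinking"])  # {'type': 'disabled'}
```

In practice the same helper lets an orchestration layer route latency-sensitive calls to non-thinking mode and reserve thinking mode for agentic, multi-step work.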
Apache 2.0 Licensing and Pricing
Both GLM-4.5 and GLM-4.5-Air ship under Apache 2.0, permitting free commercial use, modification, and self-hosting. Cloud API pricing starts at $0.60 per million input tokens and $0.20 per million output tokens, with volume discounts. Western teams should weigh data-sovereignty considerations, since the hosted API runs on Chinese infrastructure.
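A back-of-the-envelope estimate at the quoted list rates can be sketched as follows; the rates come from the article, while the workload numbers and helper name are illustrative.

```python
# Estimate API spend at the article's quoted list prices:
# $0.60 per million input tokens, $0.20 per million output tokens.
# Volume discounts would lower this; these are list rates only.

INPUT_RATE = 0.60 / 1_000_000   # dollars per input token
OUTPUT_RATE = 0.20 / 1_000_000  # dollars per output token

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of one workload at list prices."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# e.g. a month of 50M input and 10M output tokens:
print(round(estimate_cost(50_000_000, 10_000_000), 2))  # 32.0
```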
Architecture and Efficiency
GLM-4.5 employs a 355-billion-parameter Mixture-of-Experts design with 32 billion parameters active per token, Grouped-Query Attention, and Multi-Token Prediction for speculative decoding. GLM-4.5-Air scales down to 106 billion parameters. Both models were trained on trillions of tokens and fine-tuned for agentic capabilities via slime, Z.ai's in-house RL framework.
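The reason only about 9% of GLM-4.5's weights (32B of 355B) run per token is the Mixture-of-Experts router: a gate scores every expert, but only the top-k actually execute. The toy sketch below illustrates that routing step; the scores, expert count, and k are made up and this is not Z.ai's actual router.

```python
# Toy Mixture-of-Experts routing: score all experts, run only the top-k,
# so active parameters stay far below the total. All values here are
# illustrative, not taken from GLM-4.5's real gating network.
import math

def top_k_route(gate_scores: list[float], k: int) -> list[tuple[int, float]]:
    """Pick the k highest-scoring experts; softmax-normalize their weights."""
    top = sorted(range(len(gate_scores)),
                 key=lambda i: gate_scores[i], reverse=True)[:k]
    exps = [math.exp(gate_scores[i]) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

# 8 experts, 2 active per token -> only 2/8 of the expert weights run.
routes = top_k_route([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, 0.2, -0.5], k=2)
print(routes)  # experts 1 and 4, with mixture weights summing to 1.0
```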
Enterprise Impact
For AI engineers and orchestration teams, the GLM-4.5 family delivers a high-performing, cost-effective alternative to closed LLMs. Its open-source license grants full weight access and flexible deployment—cloud, on-prem, or hybrid—while preserving streaming APIs, structured outputs, and tool-calling for mission-critical workflows.
Integration and Demos
Developers can download the weights from Hugging Face and ModelScope or integrate via Z.ai's API, with serving support for vLLM and SGLang. Interactive demos, including Flappy Bird clones, Pokédex web apps, and dynamic slide generators, showcase rapid prototyping, content creation, and agent-driven experiences.
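A self-hosted deployment with vLLM might look like the following sketch. The model id matches Z.ai's Hugging Face repository, but the parallelism and port settings are assumptions that depend on your hardware.

```shell
# Serve GLM-4.5-Air locally with vLLM, exposing an OpenAI-compatible API.
# Tensor-parallel degree and port below are illustrative, not prescriptive.
vllm serve zai-org/GLM-4.5-Air \
    --tensor-parallel-size 4 \
    --port 8000
```

Once running, any OpenAI-compatible client can point at `http://localhost:8000/v1` for streaming, structured outputs, and tool calling.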
The Growing Chinese Open Source Wave
GLM-4.5 joins a wave of recent Chinese open-source releases—such as Alibaba’s Qwen3 and Wan 2.2—that challenge U.S. proprietary models. With permissive Apache 2.0 licenses and competitive benchmarks, these efforts broaden choices for enterprises and drive global innovation in foundational AI infrastructure.