In short
- GLM-5.2 trails Claude Opus 4.8 by simply 1% on FrontierSWE—a benchmark measuring multi-hour autonomous engineering tasks—whereas beating GPT-5.5 on the identical check. It ships below an MIT license with zero regional restrictions.
- The mannequin was constructed fully on Huawei Ascend chips with no NVIDIA {hardware} concerned.
- Unsloth AI already launched 2-bit GGUF quantizations that shrink the mannequin from 1.51TB to 238GB. You will nonetheless want 256GB of RAM or VRAM—however at that time, you possibly can run it.
Z.ai dropped GLM-5.2 on June 16, promising prime degree performances, beating its already superior GLM 5.1.
The Beijing-based lab, which has been on the U.S. Entity Checklist since January 2025, seems to be benefiting from rising issues over America’s method to AI. Over the previous week, the ban on Anthropic Fable and the discharge of this new mannequin have helped drive zAI’s replenish 90%, sending it to a brand new all-time excessive.

GLM 5.2 has the numbers to again up the hype.
On FrontierSWE—a benchmark that evaluates whether or not an AI agent can full open-ended technical tasks measured in hours, protecting methods optimization, large-scale code development, and utilized ML analysis, scored by dominance price—GLM-5.2 hit 74.4 towards Claude Opus 4.8’s 75.1. It edged out GPT-5.5 at 72.6. On SWE-bench Professional, which checks autonomous decision of real-world GitHub points scored as a go price, GLM-5.2 scored 62.1 to GPT-5.5’s 58.6—and cleared its predecessor GLM-5.1’s 58.4 by a large margin.
The standard leap makes it the perfect open-source mannequin to this point within the Synthetic Evaluation Intelligence Index, which aggregates the outcomes of 9 totally different scores to evaluate the overall high quality of an AI mannequin. OpenRouter’s benchmarks put it in the identical class because the now banned Claude Fable 5.

The {hardware} used to realize this feat is one other fascinating a part of the story. GLM-5.2 was educated on Huawei Ascend chips—no Nvidia wherever within the pipeline. Emad Mostaque, founding father of Stability AI, estimated whole coaching prices at round $25 million, 80% of that in post-training, which might make it extraordinarily low-cost compared towards its friends.
As Decrypt reported earlier this 12 months, Z.ai was already coaching picture fashions on Huawei’s Ascend Atlas servers with no single American chip. GLM-5.2 takes that infrastructure additional—a 744-billion-parameter mixture-of-experts mannequin with a real 1 million-token context window, 5 occasions the 200K restrict on GLM-5.1, and an MIT license which means no authorities directive can flip the entry change.
Tokens are the chunks of tet a mannequin can learn and generate whereas Parameters are the variety of inner settings and values that decide how a mannequin processes data and generates responses
Who it is for and what it prices
For builders, the context window is the operational shift. Complete-repo navigation, multi-file refactors, and lengthy agentic pipelines that beforehand required chunking develop into single-call workflows. API pricing runs $1.40 per million enter tokens and $4.40 per million output—towards Claude Opus 4.8’s $5 enter and $25 output. The Coding Plan begins at round $18 a month and works immediately inside Claude Code, Cline, Kilo Code, and hottest agentic environments.
Native deployment can also be technically potential. Unsloth AI pushed 2-bit GGUF quantizations that compress the mannequin from 1.51TB right down to 238GB whereas retaining ~82% accuracy.
Don’t get too excited, although. That also means it calls for 256GB of unified reminiscence or an identical RAM/VRAM combo—a maxed M4 Extremely Mac Studio or a workstation with a mid-range GPU and 256GB of system RAM with mixture-of-experts offloading. It’s nonetheless some huge cash, however a minimum of one thing that you may purchase and run on your home should you actually wish to.
We ran a fast check, asking GLM-5.2 to construct our normal recreation mixing typing mechanics with a shooter. The UI wasn’t the prettiest—different fashions generated extra polished-looking interfaces, however the expertise was probably the most different: totally different eventualities throughout waves, enemy sorts that shifted, bosses showing later within the run.
It generated extra various recreation states than the rest we examined for a similar activity in a zero shot setup.

If you wish to play it, it’s reside in our Itch.io profile.
That variance factors towards the place GLM-5.2 makes probably the most financial sense. For multi-shot technology workflows and agentic pipelines the place output variety issues greater than polish, the mathematics at open-source pricing ranges is tough to argue with. For the toughest sustained duties—SWE-Marathon, the place it scores 13.0 towards Opus 4.8’s 26.0—the hole to the closed frontier remains to be actual, and 13 factors extensive.
Open-source weights are reside on HuggingFace below the MIT license. The quantized weights are additionally out there on HuggingFace. GLM Coding Plan subscribers can change now with the mannequin string GLM-5.2, and it’s additionally out there without spending a dime testing on z.AI with some utilization constraints.
Day by day Debrief Publication
Begin on daily basis with the highest information tales proper now, plus unique options, a podcast, movies and extra.
