Chinese artificial intelligence (AI) startup Zhipu AI, also known as Z.ai, has released GLM-5.2, an open-weight model that matches Anthropic’s controversial Mythos-class model in cybersecurity and software vulnerability detection tasks. Researchers testing and comparing frontier models say Chinese AI companies continue to hold a significant cost advantage, first highlighted by DeepSeek early last year.

American cybersecurity company Semgrep used an IDOR (Insecure Direct Object Reference) benchmark, which tests for a specific vulnerability in which an application exposes an internal identifier, such as a user ID, without checking that the requester is permitted to access the object it refers to. It noted that GLM-5.2 scored higher (39%) than Anthropic’s Claude Opus 4.6 (32%) and Claude Opus 4.8/4.7 (28%).
The 744-billion-parameter mixture-of-experts (MoE) model proved surprisingly strong at reasoning through complex, repository-scale authorization errors in code. The researchers note that, among models given the same minimal signal and harness, GLM-5.2, an open-weights model at the cost frontier of LLMs, actually beat Claude Code on this difficult security research task.
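To make the IDOR class of bug concrete, here is a minimal Python sketch of the pattern and its fix. It is not taken from the Semgrep benchmark; the `Invoice` type, the in-memory `INVOICES` store, and both function names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    invoice_id: int
    owner_id: int
    total: float


# Hypothetical in-memory "database" keyed by invoice ID.
INVOICES = {
    1: Invoice(invoice_id=1, owner_id=100, total=42.0),
    2: Invoice(invoice_id=2, owner_id=200, total=99.5),
}


def get_invoice_vulnerable(current_user_id: int, invoice_id: int) -> Invoice:
    # IDOR: the caller-supplied ID is trusted and ownership is never checked,
    # so any logged-in user can read any invoice by guessing its ID.
    return INVOICES[invoice_id]


def get_invoice_checked(current_user_id: int, invoice_id: int) -> Invoice:
    invoice = INVOICES.get(invoice_id)
    # Treat "missing" and "not yours" the same way so IDs cannot be probed.
    if invoice is None or invoice.owner_id != current_user_id:
        raise PermissionError("invoice not found or not accessible")
    return invoice
```

In a real repository the missing check is rarely this visible. It is usually spread across routes, middleware, and data-access helpers, which is why finding it requires reasoning over the whole codebase rather than a single function.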
Z.ai released GLM-5.2 earlier this month, saying it had focused on optimizing for “long-horizon”, or agentic, tasks while relying less on high token usage.
“It is easy to claim a 1M context, but much harder to keep it reliable under real engineering pressure. To this end, we have significantly expanded the 1M-context training for coding-agent scenarios, including large-scale implementation, automated research, performance optimization, and complex debugging,” the company noted at the time.
The cost advantage that Chinese AI models have consistently demonstrated since DeepSeek took the AI world by storm early last year remains intact. In terms of estimated token cost, Z.ai’s GLM-5.2 costs $1.40 per million input tokens and $4.40 per million output tokens. By comparison, similar usage of Anthropic’s Claude Opus 4.8 costs developers and enterprises $5 and $25, respectively.
“GLM 5.2, without any scaffolding, beats Claude Code by seven points (39% vs. 32%). An open-weights model run on a bare prompt outperformed the frontier coding agent on the logic-heavy security task. And it did it very cheaply! At GLM 5.2’s pricing, the open-weights run was found to cost about $0.17 per vulnerability,” the researchers add.
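The price gap can be worked through with a short Python sketch. The per-million-token prices are the ones quoted above; the workload of 2 million input tokens and 500,000 output tokens is a hypothetical figure chosen only to make the arithmetic easy to follow, and `run_cost` is an illustrative helper, not part of any vendor SDK.

```python
# Per-million-token prices in USD, as quoted in the article.
PRICES = {
    "GLM-5.2": {"input": 1.40, "output": 4.40},
    "Claude Opus 4.8": {"input": 5.00, "output": 25.00},
}


def run_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one run at list prices."""
    price = PRICES[model]
    return (
        input_tokens * price["input"] + output_tokens * price["output"]
    ) / 1_000_000


# Hypothetical workload: 2M input tokens and 0.5M output tokens.
glm_cost = run_cost("GLM-5.2", 2_000_000, 500_000)
opus_cost = run_cost("Claude Opus 4.8", 2_000_000, 500_000)

print(f"GLM-5.2:         ${glm_cost:.2f}")
print(f"Claude Opus 4.8: ${opus_cost:.2f}")
print(f"Ratio:           {opus_cost / glm_cost:.1f}x")
```

For this assumed workload the totals come to about $5.00 versus $22.50, a ratio of roughly 4.5 to 1. The ratio shifts with the mix of input and output tokens, since the output price gap (about 5.7 times) is wider than the input price gap (about 3.6 times).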
Other LLMs in this test include MiniMax M3, Kimi K2.7 Code, OpenAI GPT-5.5, and DeepSeek V4.
That said, GLM is not always better than models from Anthropic and OpenAI on more general tasks. The result does, however, indicate that Chinese AI models have systematically narrowed the gap in average capabilities with other AI companies.
GLM-5.2 is one of the 10 most used AI models on AI marketplace OpenRouter’s LLM usage leaderboard, alongside models from Anthropic, DeepSeek, Xiaomi, and Tencent.
Unlike Anthropic’s Claude models or OpenAI’s GPT, open-weight models like GLM-5.2 can be downloaded and modified, meaning users can fine-tune them for specific tasks, operate them without relying on a commercial provider, and even remove safety guardrails. This will raise concerns about open models like GLM being used to mount cyberattacks, as threat actors would have tools as powerful as those available to defenders.
Z.ai founder Jie Tang has already publicly stated his intention to release another open-source model that will compete directly with Anthropic’s Fable 5, the first “Mythos” model, before the end of this year.