Alibaba is launching compact Qwen 3.5 models that run directly on laptops and outperform OpenAI counterparts.
Alibaba Qwen 3.5 – a new series of compact AI models
Alibaba's artificial-intelligence division announced the Qwen 3.5 model line, which promises small size combined with high efficiency. Compared with American counterparts, the models demonstrate better accuracy while using less memory.
| Model | Size | Key Features |
|---|---|---|
| Qwen 3.5‑0.8B | 0.8 billion parameters | "Tiny" and fast, designed for prototypes and mobile devices with tight battery and memory budgets |
| Qwen 3.5‑2B | 2 billion | Similar to 0.8B but a bit more powerful |
| Qwen 3.5‑4B | 4 billion | Multimodal, context window of 262,144 tokens; suitable for lightweight agent solutions |
| Qwen 3.5‑9B | 9 billion | Capable of reasoning; outperforms OpenAI gpt‑oss‑120B (roughly 13.5× more parameters) and demonstrates graduate‑level logical thinking |
All models are available under the Apache 2.0 license, allowing use in commercial projects and further fine‑tuning if needed.
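To see why these parameter counts matter for local deployment, the weight-memory footprint can be estimated with back-of-the-envelope arithmetic (a rough sketch: real memory use is higher because of activations, KV cache, and runtime overhead):

```python
# Rough weight-memory estimate: parameter count times bytes per parameter.
# Actual usage also includes activations, KV cache, and runtime overhead.

def weight_memory_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate weight storage in GiB for a given precision."""
    return params_billions * 1e9 * bytes_per_param / 2**30

for name, params in [("0.8B", 0.8), ("2B", 2.0), ("4B", 4.0), ("9B", 9.0)]:
    fp16 = weight_memory_gb(params, 2.0)   # 16-bit weights
    int4 = weight_memory_gb(params, 0.5)   # 4-bit quantized weights
    print(f"Qwen 3.5-{name}: ~{fp16:.1f} GiB fp16, ~{int4:.1f} GiB int4")
```

Even the largest 9B model needs well under 20 GiB of weights at fp16, and only a few GiB when 4-bit quantized, which is why laptop-class hardware is plausible.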
What’s new in the architecture?
Alibaba moved away from the classic Transformer design to a hybrid scheme:
* Gated Delta Networks (GDN) – provide high throughput and low latency.
* Mixture‑of‑Experts (MoE) – activates only a subset of parameters per token, giving small models larger effective capacity within a tight memory budget.
Thanks to this, Qwen 3.5 handles multimodal tokens directly instead of bolting separate vision modules onto text models, as previous generations did. Consequently, the 4B and 9B versions can recognize UI elements and count objects in video.
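The MoE idea described above — routing each token to only a few expert sub-networks so that total capacity grows without growing per-token compute — can be illustrated with a minimal toy router. This is a pure-Python sketch with made-up sizes, not Alibaba's implementation:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_route(gate_logits, k=2):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(gate_logits)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in chosen)
    return [(i, probs[i] / total) for i in chosen]

def moe_layer(x, experts, gate_logits, k=2):
    """Weighted sum of the outputs of only the k routed experts."""
    return sum(w * experts[i](x) for i, w in top_k_route(gate_logits, k))

# Toy demo: four "experts", each just scales its input by a constant.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
print(moe_layer(10.0, experts, gate_logits=[0.1, 0.2, 2.0, 1.5], k=2))
```

Only two of the four experts run per input here; a real MoE layer applies the same principle to full feed-forward networks inside the Transformer block.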
Tests and results
| Benchmark | Result |
|-----------|--------|
| MMMU‑Pro (visual) | Qwen 3.5‑9B: 70.1 % – surpasses Google Gemini 2.5 Flash‑Lite (59.7 %) and the specialized Qwen 3‑VL‑30B‑A3B (63.0 %) |
| Logical reasoning | Qwen 3.5‑9B: 81.7 % – higher than OpenAI gpt‑oss‑120B (80.1 %), despite the latter having more than thirteen times as many parameters |
| HMMT Feb 2025 (mathematics) | 83.2 % (9B), 74.0 % (4B) – evidence that complex scientific tasks don't require large cloud resources |
| OmniDocBench v1.58 | Qwen 3.5‑9B: 87.7 % – the best score among the models compared |
| MMMLU (multilingual) | Qwen 3.5‑9B: 81.2 % – surpasses gpt‑oss‑120B (78.2 %) |
Why is this important?
The arrival of Qwen 3.5 coincided with growing demand for autonomous AI agents. Modern users require not only chatbots but systems that:
1. Think – reason about tasks.
2. See – process images, video, and UI elements.
3. Act – use tools (fill forms, sort files).
Given that large cloud-hosted models are expensive to run, Qwen 3.5 offers a more economical alternative. The models can run locally without cloud or API access, and reinforcement learning trains them to make "human-like decisions" – for example, organizing a desktop or writing code from a video recording.
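The "think / see / act" loop above ultimately comes down to mapping an instruction onto a tool call. A minimal, hypothetical dispatcher makes the shape of this concrete — here simple keyword matching stands in for the model's actual reasoning, and the tool names are invented for illustration:

```python
# Toy agent tool dispatch. A real model would choose the tool and its
# arguments through reasoning; keyword matching stands in for that step.

def sort_files(command: str) -> str:
    return f"sorted files per request: {command}"

def fill_form(command: str) -> str:
    return f"filled form per request: {command}"

TOOLS = {
    "sort": sort_files,   # hypothetical tool names for illustration
    "form": fill_form,
}

def dispatch(command: str) -> str:
    """Route a natural-language command to the first matching tool."""
    for keyword, tool in TOOLS.items():
        if keyword in command.lower():
            return tool(command)
    return "no tool matched; answer directly"

print(dispatch("Please sort the downloads folder"))
print(dispatch("Fill in this form with my address"))
```

In a real agent stack the dispatcher's decision comes from the model itself, and the tools wrap genuine OS or browser actions rather than returning strings.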
Practical applications
* Mobile devices – 0.8 billion parameters fit easily on a smartphone and enable offline operation.
* Workstations – 9 billion parameters provide a full suite of agent AI functions without the cloud.
* Interface agents – thanks to "pixel‑level binding," models can navigate UIs, fill forms, and sort files, executing simple natural‑language commands with about 90 % accuracy.
Thus, Alibaba Qwen 3.5 paves the way for more accessible, flexible, and powerful AI agents that can operate both in the cloud and locally, meeting the growing needs of today’s users.