TL;DR
- Closed Model: Alibaba released Qwen3.5-Omni as a proprietary API-only service, breaking from its open-source tradition for Qwen models.
- Multimodal Capabilities: Qwen3.5-Omni processes text, images, audio, and video natively, recognizes 113 languages, and demonstrated emergent coding abilities from video input.
- Community Impact: The proprietary shift affects over 290,000 developers and 113,000 community model variations built on Qwen’s open-source ecosystem.
- Leadership Turbulence: Alibaba’s AI division has lost three senior executives in 2026, including Qwen technical lead Lin Junyang, amid an internal restructuring.
Alibaba’s Qwen family ranks as the top downloaded open-source AI model ecosystem on Hugging Face, with over 113,000 community variations. Its newest model, Qwen3.5-Omni, is closed-source.
Qwen3.5-Omni launched on March 30, processing text, images, audio, and video natively in a single model, deepening Alibaba’s push into the omni-modal AI race. However, keeping it proprietary represents a departure from Alibaba’s previous open-source Qwen models and signals a potential monetization pivot as AI-related revenue grows as a share of Alibaba Cloud’s business.
Alibaba has not published model weights or named a license for Qwen3.5-Omni, making it available only as an API service. No official statement from Alibaba has addressed the reasoning behind keeping Qwen3.5-Omni proprietary, even as the company continues to benefit from the open-source community that grew around earlier Qwen releases.
What Qwen3.5-Omni Can Do
Qwen3.5-Omni comes in three sizes (Plus, Flash, and Light), each supporting a 256,000-token context window. Training drew on over 100 million hours of audio-visual data, producing a system that recognizes 113 languages and dialects in speech, up from 19 in the previous generation.
Building on this foundation, Qwen3.5-Omni generates speech in 36 languages. The model evolved from Qwen 3 Omni Flash, released in December 2025, and represents Alibaba’s second major AI release in six weeks, following the text-and-vision Qwen 3.5 model in February 2026.
In addition, several conversational AI features distinguish the model. Semantic interruption allows Qwen3.5-Omni to distinguish between conversational fillers like “uh-huh” and genuine attempts to interrupt. Voice cloning lets users upload a voice sample that the model adopts for its responses, though Alibaba restricts this feature to its API.
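Alibaba has not described how semantic interruption is implemented. A toy heuristic can illustrate the distinction the feature has to make between back-channel fillers and genuine barge-in; the filler list and logic below are invented for illustration, not Alibaba's method:

```python
# Toy barge-in classifier -- NOT Alibaba's implementation, just a sketch
# of the filler-vs-interruption distinction semantic interruption makes.
FILLERS = {"uh-huh", "mm-hmm", "yeah", "right", "okay"}

def is_genuine_interruption(utterance: str) -> bool:
    """Treat short, filler-only utterances as back-channels; anything
    containing a substantive word counts as a real attempt to interrupt."""
    words = utterance.lower().strip("?!. ").split()
    if not words:
        return False
    return any(word not in FILLERS for word in words)

print(is_genuine_interruption("uh-huh"))            # back-channel: keep talking
print(is_genuine_interruption("wait, stop there"))  # real interruption: yield
```

A production system would of course work on audio and semantics rather than a keyword list, but the decision boundary, keep talking versus yield the floor, is the one described above.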
On benchmarks, Alibaba claims Qwen3.5-Omni outperforms Gemini 3.1 Pro on general audio understanding, reasoning, and translation tasks, while matching Google’s model on audio-visual comprehension.
Separately, Alibaba says Qwen3.5-Omni beat ElevenLabs, GPT-Audio, and Minimax on multilingual voice stability across 20 languages.
Qwen3.5-Omni can process a YouTube video in about one minute natively. By contrast, OpenAI’s GPT 5.4 requires approximately nine minutes using a stitched pipeline of frame extraction, Whisper transcription, and OCR.
Where GPT 5.4 relies on separate modules for each modality, Qwen3.5-Omni ingests raw video, audio, and text simultaneously through a unified architecture.
A new synchronization technology, Adaptive Rate Interleave Alignment (ARIA), coordinates text and speech outputs for more natural responses. Combined with support for real-time web search and the ability to handle more than 10 hours of audio or over 400 seconds of 720p video, Qwen3.5-Omni positions itself as one of the broadest multimodal offerings currently available.
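The architectural difference behind the speed gap can be sketched with stubs: the stitched approach runs separate per-modality stages and merges their text outputs before any reasoning happens, while a unified model ingests everything in one call. All function names here are illustrative stand-ins, not real APIs:

```python
# Illustrative stubs -- each stands in for a real component (frame
# extraction, Whisper ASR, OCR) in the stitched approach the article
# attributes to GPT 5.4's video handling.
def extract_frames(video: str) -> list[str]:
    return [f"{video}#frame{i}" for i in range(3)]

def transcribe_audio(video: str) -> str:         # e.g. Whisper ASR
    return f"transcript({video})"

def ocr_frames(frames: list[str]) -> list[str]:  # e.g. on-screen text OCR
    return [f"ocr({frame})" for frame in frames]

def stitched_pipeline(video: str) -> str:
    """Three sequential stages whose text outputs are concatenated
    before a text-only model ever sees them -- slow and lossy."""
    frames = extract_frames(video)
    parts = [transcribe_audio(video)] + ocr_frames(frames)
    return " | ".join(parts)

def unified_pipeline(video: str) -> str:
    """One call: an omni model ingests raw video and audio directly,
    with no intermediate text representation."""
    return f"omni_answer({video})"

print(stitched_pipeline("talk.mp4"))
print(unified_pipeline("talk.mp4"))
```

Each hand-off in the stitched version adds latency and discards signal (tone of voice, timing, visual context), which is why a natively multimodal model can be both faster and more accurate on the same clip.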
Emergent Capabilities
Not all of Qwen3.5-Omni’s capabilities were deliberately trained. An emergent capability called audio-visual vibe coding allows the model to watch screen recordings and write code from combined visual and audio input.
For instance, in demos Qwen3.5-Omni-Plus builds a working snake game from a verbal description paired with a video clip, without having been specifically trained on code-from-video tasks. Alibaba’s researchers noted the behavior arose spontaneously from multimodal training, raising questions about what other latent capabilities may exist in large-scale omni-modal architectures.
Beyond the snake game demo, the model also breaks down a three-minute lion documentary scene by scene, demonstrating that its video comprehension extends to long-form content analysis.
Why Open Source Matters, and Why Alibaba Walked Away
Alibaba’s Qwen models have so far been open-source under the Apache 2.0 license, a permissive framework that allows anyone to modify and commercialize them. Open availability fueled adoption: Hugging Face CEO Clement Delangue noted that Chinese open-source models surpassed US models in downloads on the platform for the first time in 2025, and a modified Qwen version currently holds the top spot on Hugging Face’s open AI model leaderboard.
Moreover, Qwen models have attracted over 290,000 developers globally, building a community that now faces uncertainty about future access.
Enterprise adoption further illustrates what is at stake. Airbnb uses Qwen for its AI customer service chatbot, as confirmed by CEO Brian Chesky, and Pinterest experiments with Qwen alongside in-house models. Steve Frey, an AI industry cofounder and product lead, noted that open-source models can reduce costs 5-10x through self-hosting and mixture-of-experts architectures. Frey attributed Qwen’s 113,000+ Hugging Face variations to open weights enabling enterprise customization, a value proposition now partially at risk with the proprietary shift.
Delangue’s framing highlights the significance of Alibaba’s departure. If the dominant Chinese AI lab begins closing its models, the open-source advantage that drove adoption could erode. Alibaba launched the Qwen3 family as open-source in April 2025, establishing the baseline from which this proprietary shift departs.
In September 2025, Alibaba also released Qwen3-Max as a closed-source commercial model, suggesting the proprietary pivot has been building for months.
For enterprises already running Qwen-based systems, the proprietary shift introduces uncertainty. Companies that built pipelines around open Qwen weights face questions about whether future models will remain accessible or whether Alibaba will increasingly gate advanced capabilities behind its cloud platform. Alibaba has not commented publicly on its long-term licensing strategy for the Qwen family.
Organizational Shifts and Outlook
Alibaba is restructuring around AI at an organizational level. CEO Eddie Wu consolidated AI operations under a new Alibaba Token Hub (ATH) division, bringing together five units: Tongyi Laboratory, MaaS Business Line, Qwen Business Unit, Wukong Business Unit, and AI Innovation Business Unit.
Meanwhile, Qwen technical lead Lin Junyang recently announced a surprise departure, the third senior executive exit from the AI unit in 2026, and key team members have followed him. The departures were sparked by an internal shakeup that placed a recruit from Google’s Gemini team in charge. Alibaba has since hired former Google researcher Zhou Hao as a direct replacement, and CEO Wu announced a dedicated task force to stabilize the AI division’s leadership.
Furthermore, according to Benzinga’s market analysis, over 80% of Alibaba’s open positions are now AI-related, up from 60% a year ago, spanning 16 business units including Alibaba Cloud and chip arm T-Head. Seven new AI-focused roles have appeared in campus recruitment, including positions focused on agentic AI.
Per earlier company disclosures, Alibaba has committed more than $53 billion toward infrastructure and AI development, underscoring the scale of the company’s bet on AI as its next growth engine.
Notably, Alibaba did not reference domestic competitor DeepSeek in its Qwen 3.5 announcement, comparing performance only against prior Qwen iterations and US-made models from Google and OpenAI. Following DeepSeek’s viral rise in early 2025, Alibaba released Qwen 2.5-Max in January 2025, and the two companies have since competed for developer mindshare across the Chinese open-source AI market.
“Recent advances in open source had significantly narrowed the performance gap with leading closed models.”
Matt Madrigal, CTO of Pinterest (via Forbes report on open-source AI adoption)
Madrigal’s assessment captures the tension at the heart of Alibaba’s decision. Alibaba built an ecosystem valued precisely because it was open, then chose to restrict its flagship multimodal model. Developers can access Qwen3.5-Omni via Alibaba Cloud API, Qwen Chat, and a Hugging Face demo, but not as downloadable weights, the format that enabled the community variations and enterprise deployments that defined Qwen’s rise.