On Wednesday, Alibaba added voice and video chat capabilities to Qwen Chat and released Qwen2.5-Omni-7B, the new model that makes them possible. The model is open source under the Apache 2.0 licence.
In a blog post, the company described Qwen2.5-Omni as the new flagship end-to-end multimodal model in the Qwen series. According to the post, the model is designed for multimodal perception and processes text, images, audio, and video, delivering real-time streaming responses as both text and synthesised speech.
The key features of the model include a ‘Thinker-Talker’ architecture, which allows it to provide real-time responses. The Thinker, a Transformer decoder, acts like the brain, while the Talker, designed as a dual-track autoregressive Transformer decoder, operates like the human mouth.
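To make the division of labour concrete, here is a toy Python sketch of the streaming hand-off between the two components. It is not Alibaba's implementation: the functions `thinker`, `talker`, and `respond`, the made-up "hidden state" values, and the integer "audio codes" are all hypothetical stand-ins, chosen only to show how a Talker could start producing output while the Thinker is still generating.

```python
from typing import Iterator, List, Tuple


def thinker(prompt: str) -> Iterator[Tuple[str, List[float]]]:
    """Toy stand-in for the Thinker (the "brain").

    Yields one (text_token, hidden_state) pair at a time. A real model
    would run a Transformer decoder here; this just derives fake values.
    """
    for word in prompt.split():
        token = word.upper()
        hidden = [float(len(word)), float(sum(map(ord, word)) % 7)]
        yield token, hidden


def talker(stream: Iterator[Tuple[str, List[float]]]) -> Iterator[Tuple[str, int]]:
    """Toy stand-in for the Talker (the "mouth").

    Consumes the Thinker's output as it arrives and emits a fake
    integer audio code for each token, without waiting for the full reply.
    """
    for token, hidden in stream:
        audio_code = int(hidden[0] * 10 + hidden[1])
        yield token, audio_code


def respond(prompt: str) -> List[Tuple[str, int]]:
    """Chain the two stages and collect the streamed (text, audio) pairs."""
    return list(talker(thinker(prompt)))


if __name__ == "__main__":
    for text, audio in talker(thinker("hi there")):
        print(text, audio)
```

Because both stages are generators, the first text and audio pair is available as soon as the Thinker has produced its first token, which is the property that real-time voice and video chat depends on.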
Alibaba’s Qwen2.5-Omni model has shown strong performance across various tasks, including speech recognition, translation, audio and video understanding, and speech generation, outperforming similar models at tasks that require multiple modalities.
It was compared with single-modality models and closed-source models, such as Qwen2.5-VL-7B, Qwen2-Audio, and Gemini-1.5-pro, and achieved state-of-the-art performance.
Voice Chat + Video Chat! Just in Qwen Chat (https://t.co/FmQ0B9tiE7)! You can now chat with Qwen just like making a phone call or making a video call! Check the demo in https://t.co/42iDe4j1Hs
What’s more, we opensource the model behind all this, Qwen2.5-Omni-7B, under the… pic.twitter.com/LHQOQrl9Ha
— Qwen (@Alibaba_Qwen) March 26, 2025
The paper and code for the new model can be found on GitHub, while the AI model is available on Hugging Face along with a demo.
Last month, Alibaba also launched QwQ-Max-Preview, a new AI reasoning model within the Qwen family that specialises in mathematics and coding tasks and features a “thinking” capability in the Qwen Chat application.
The model outperformed OpenAI’s models on the LiveCodeBench leaderboard. Smaller variants are expected to be open-sourced for local device deployment, and a dedicated mobile app is also expected.
There may be a lot more coming, considering Alibaba’s commitment to investing over $52 billion in AI over the next three years.
The post Alibaba Releases Qwen2.5 Omni, Adds Voice and Video Modes to Qwen Chat appeared first on Analytics India Magazine.