Mistral Unveils Voxtral, Its Open-Source Bet to Rival OpenAI and ElevenLabs

French AI startup Mistral has released Voxtral, a new family of open-source speech understanding models designed to deliver production-ready transcription and semantic audio analysis at a fraction of the cost of proprietary alternatives, such as OpenAI Whisper and ElevenLabs Scribe.

The Voxtral models come in two variants: a 24B version for large-scale deployments and a 3B ‘Mini’ version for local or edge use. Both are available under the Apache 2.0 licence and can be downloaded via Hugging Face or accessed through Mistral’s API. A dedicated low-cost transcription endpoint is also available, priced at $0.001 per minute.
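To put the quoted price in perspective, the short Python sketch below estimates what a batch of audio would cost at $0.001 per minute. It assumes billing is strictly linear in audio duration, with no minimum charge, rounding rule, or volume discount; the article does not confirm any of these. The function name `transcription_cost_usd` is hypothetical and is not part of any Mistral SDK.

```python
from decimal import Decimal

# Price quoted in the article for the dedicated transcription endpoint.
PRICE_PER_MINUTE_USD = Decimal("0.001")


def transcription_cost_usd(audio_minutes):
    """Estimate the cost in USD for a given number of audio minutes.

    Assumption: cost scales linearly with duration (no minimums or rounding).
    """
    minutes = Decimal(str(audio_minutes))
    if minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    return minutes * PRICE_PER_MINUTE_USD


if __name__ == "__main__":
    # Compare a single hour, a small archive, and a large archive.
    for hours in (1, 100, 10_000):
        cost = transcription_cost_usd(hours * 60)
        print(f"{hours:>6} h of audio: ${cost:.2f}")
```

Under these assumptions, one hour of audio comes to about six cents and 10,000 hours to about $600. Actual invoices may differ depending on how Mistral meters usage.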

With a context window of up to 32,000 tokens for long-form audio, Voxtral can answer questions about a recording and summarise it directly, without chaining a separate transcription model and language model. It supports multiple languages and lets developers trigger function calls, such as backend actions or API requests, directly from spoken prompts.

Benchmark results shared by Mistral show Voxtral outperforming Whisper Large V3, GPT-4o Mini Transcribe, and Gemini 2.5 Flash on a range of transcription and multilingual benchmarks, including FLEURS and Mozilla Common Voice.

The company claims state-of-the-art results in English and European languages, along with strong audio understanding and translation performance.

Voxtral also retains the text processing capabilities of its Mistral Small 3.1 backbone, enabling seamless transitions between voice and language tasks. For enterprises, Mistral offers options for on-premises deployment, domain-specific fine-tuning, and extended features such as speaker ID, emotion detection, and diarization.

The models can be tested via Le Chat’s voice mode or integrated via API. Mistral will also hold a webinar with Inworld AI on August 6 to demonstrate end-to-end voice agent applications.

Mistral revealed that it is actively hiring to expand its audio team as it pushes toward building “near-human-like voice interfaces”.

The company also recently launched Magistral, its first reasoning-focused language model, in two versions: the open-source Magistral Small and the enterprise-grade Magistral Medium. Tuned for multi-step logic across domains like finance and healthcare, it supports multiple languages, and Mistral reports strong benchmark scores for it.

The post Mistral Unveils Voxtral, Its Open-Source Bet to Rival OpenAI and ElevenLabs appeared first on Analytics India Magazine.
