Hugging Face has released SmolLM3, a 3B-parameter language model that offers long-context reasoning, multilingual capabilities, and dual-mode inference, making it one of the most competitive small-scale open models to date. The model is available under the Apache 2.0 license.
Trained on 11.2 trillion tokens, SmolLM3 outperforms other models in its class, including Llama-3.2-3B and Qwen2.5-3B, while rivalling larger 4B models such as Gemma3 and Qwen3.
The model supports six languages: English, French, Spanish, German, Italian, and Portuguese. It can process context lengths of up to 128k tokens, enabled by the NoPE and YaRN techniques.
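Broadly, YaRN extends a model's usable context by rescaling the rotary position embedding (RoPE) frequencies: high-frequency dimensions are left untouched, low-frequency dimensions are interpolated by the scaling factor, and a smooth ramp bridges the two regimes. Below is a minimal, self-contained sketch of that frequency rescaling; the hyperparameter values (`base`, `orig_max_pos`, `beta_fast`, `beta_slow`) are illustrative defaults from the YaRN literature, not SmolLM3's actual configuration.

```python
import math

def yarn_inv_freq(dim, base=10000.0, factor=2.0,
                  orig_max_pos=65536, beta_fast=32.0, beta_slow=1.0):
    """Return RoPE inverse frequencies with YaRN-style scaling.

    Dimensions that complete many rotations within the original context
    keep their frequency; slowly rotating dimensions are interpolated by
    `factor`; a linear ramp blends the two. Illustrative sketch only --
    not SmolLM3's actual hyperparameters.
    """
    half = dim // 2
    inv_freq = [base ** (-2.0 * i / dim) for i in range(half)]

    def correction_dim(num_rotations):
        # Dimension index at which `num_rotations` full rotations occur
        # over the original maximum position.
        return (dim * math.log(orig_max_pos / (num_rotations * 2 * math.pi))
                ) / (2 * math.log(base))

    low = max(math.floor(correction_dim(beta_fast)), 0)
    high = min(math.ceil(correction_dim(beta_slow)), half - 1)

    scaled = []
    for i, f in enumerate(inv_freq):
        if i < low:          # high frequency: extrapolate (leave as-is)
            ramp = 0.0
        elif i > high:       # low frequency: fully interpolate
            ramp = 1.0
        else:                # smooth transition between the two regimes
            ramp = (i - low) / max(high - low, 1)
        scaled.append(f * ((1 - ramp) + ramp / factor))
    return scaled
```

With `factor=2.0`, the lowest-frequency dimensions end up at half their original frequency, which is what lets positions beyond the trained context still map onto rotation angles the model has seen.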

The release includes both a base model and an instruction-tuned model with dual reasoning modes. Users can toggle a flag to control whether the model generates answers with or without reasoning traces.
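According to the model card, the toggle works through system-prompt flags: prepending `/think` elicits a reasoning trace before the answer, while `/no_think` requests a direct answer. A minimal sketch of building a chat request either way (the flag strings follow the published model card; the helper function name is our own):

```python
def build_messages(user_prompt, reasoning=True):
    """Build a chat message list toggling SmolLM3's reasoning mode.

    Uses the /think and /no_think system-prompt flags described on the
    SmolLM3 model card; the helper itself is an illustrative sketch.
    """
    flag = "/think" if reasoning else "/no_think"
    return [
        {"role": "system", "content": flag},
        {"role": "user", "content": user_prompt},
    ]

# With reasoning traces enabled:
messages = build_messages("Explain why the sky is blue.", reasoning=True)
# messages[0] is {"role": "system", "content": "/think"}
```

The resulting message list can then be passed to the model's chat template in the usual way, e.g. via `tokenizer.apply_chat_template(messages)` in the transformers library.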
Pretraining was conducted over three stages with evolving mixes of web, code, and math datasets. A mid-training phase extended the model’s context length and added general reasoning capabilities, followed by supervised fine-tuning and preference alignment using Anchored Preference Optimisation (APO).
SmolLM3 achieved strong results across 12 benchmarks, ranking high on knowledge and reasoning tasks and demonstrating strong multilingual and coding performance. The instruct and reasoning modes yielded further gains on tasks such as LiveCodeBench and AIME 2025.
The full training recipe, including data mixtures, ablations, synthetic data generation, and model alignment steps, has also been made public on its GitHub and Hugging Face pages. This open approach aims to help the research community replicate and build on SmolLM3’s performance.
A few months back, Hugging Face launched SmolLM2, an open-source small language model trained on 11 trillion tokens, including custom datasets for math, code, and instruction-following. It outperforms models like Qwen2.5-1.5B and Llama3.2-1B on several benchmarks, particularly MMLU-Pro, while achieving competitive results on others, like TriviaQA and Natural Questions.
It appears that Hugging Face is focusing on steady, incremental improvements to its small language models.
The post Hugging Face’s Latest Small Language Model Adds Reasoning Capabilities appeared first on Analytics India Magazine.