Model Overview
The lightblue/DeepSeek-R1-Distill-Qwen-1.5B-Multilingual is a 1.5 billion parameter language model developed by Lightblue, based on deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B. Its core innovation is multilingual Chain-of-Thought (CoT) fine-tuning, which enables it to process and generate responses, including its internal 'thought' process, in the language of the prompt. This contrasts with the original R1 models, which often default to Chinese or English for internal reasoning regardless of the prompt language.
Key Capabilities
- Multilingual CoT Reasoning: The model is specifically trained to perform Chain-of-Thought reasoning and generate responses entirely within the user's language, enhancing explainability and understandability for diverse linguistic contexts.
- Broad Language Support: While performing well across many languages, it shows stronger performance in higher-resource languages such as Japanese, English, German, Arabic, and French, compared to lower-resource languages like Amharic or Lao.
- Extended Context Window: Features a context length of 131,072 tokens (128K), allowing it to process long inputs and maintain conversational coherence.
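R1-style distills conventionally emit their reasoning inside `<think>…</think>` tags before the final answer. A minimal sketch for separating the thought process from the response, assuming that tag convention (the helper name and tag format are illustrative, not part of this model card):

```python
import re

def split_cot(text: str) -> tuple[str, str]:
    """Split an R1-style completion into (thought, answer).

    Assumes the reasoning is wrapped in <think>...</think>, as in
    DeepSeek-R1 distill outputs; returns an empty thought if the
    tags are absent.
    """
    m = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not m:
        return "", text.strip()
    thought = m.group(1).strip()
    answer = text[m.end():].strip()
    return thought, answer
```

With multilingual CoT fine-tuning, both parts of the split should come back in the prompt's language, which makes this kind of separation useful for showing or hiding the reasoning per locale.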
Usage Recommendations
- Sampling Parameters: It is recommended to use a sampling temperature between 0.5 and 0.7 for optimal output quality.
- Repetition Penalty: For niche languages where repetition may occur, setting a repetition_penalty of 1.1 or higher is advised.
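The recommendations above can be sketched as a set of decoding settings for the Hugging Face transformers `generate()`/`pipeline` API. This is an illustrative sketch, not an official snippet from the model card; the `generate` helper and the exact values (0.6 within the 0.5-0.7 range, 512 new tokens) are assumptions:

```python
MODEL_ID = "lightblue/DeepSeek-R1-Distill-Qwen-1.5B-Multilingual"

# Decoding settings following this card's recommendations.
GENERATION_KWARGS = {
    "do_sample": True,
    "temperature": 0.6,         # within the recommended 0.5-0.7 range
    "repetition_penalty": 1.1,  # advised for niche languages
    "max_new_tokens": 512,      # illustrative choice, not from the card
}

def generate(prompt: str) -> str:
    # Imported lazily so the settings above can be inspected without
    # loading transformers or downloading the model weights.
    from transformers import pipeline

    pipe = pipeline("text-generation", model=MODEL_ID)
    out = pipe([{"role": "user", "content": prompt}], **GENERATION_KWARGS)
    return out[0]["generated_text"][-1]["content"]
```

Prompting in the target language should be enough to keep both the thought process and the answer in that language; no extra system prompt is described as required.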
Evaluation Highlights
Preliminary evaluations indicate that the model generally produces correct answers and maintains the prompt's language for both its thinking process and final response across a variety of languages. For instance, in a quick 5-question evaluation it achieved high scores (>=0.8) for correctly formatted and accurate results in languages such as English, German, Japanese, and Korean.