raalr/qwen3-1.7b-arabic-standard-kd-500k-run1
raalr/qwen3-1.7b-arabic-standard-kd-500k-run1 is a language model with approximately 2 billion parameters, fine-tuned from Qwen/Qwen3-1.7B-Base for Arabic language tasks. It reached a validation loss of 1.2204 at the end of training. It targets general Arabic text generation and understanding, which makes it a candidate for applications that need a compact Arabic LLM.
Overview
raalr/qwen3-1.7b-arabic-standard-kd-500k-run1 keeps the Qwen3-1.7B-Base architecture, with approximately 2 billion parameters, and was fine-tuned for Arabic to a final validation loss of 1.2204.
Training Details
The model was trained for 3 epochs at a learning rate of 2e-05, using a cosine learning rate scheduler with a warmup value of 0.05 (reported as warmup steps, although a fractional value suggests a warmup ratio). The total batch size was 16: 2 per device with 8 gradient accumulation steps. Validation loss decreased consistently during training and settled at about 1.2204 by the end of the third epoch.
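The batch-size arithmetic and the shape of the learning rate schedule can be sketched in a few lines of plain Python. This is an illustration only, not the training code: it assumes a single device (implied by 2 x 8 = 16), reads the 0.05 warmup figure as a ratio of total steps, and uses a linear warmup followed by cosine decay to zero. The total step count in the usage note is hypothetical, since the card does not report it.

```python
import math

# Values taken from the training details above.
PER_DEVICE_BATCH = 2
GRAD_ACCUM_STEPS = 8
PEAK_LR = 2e-5

# Assumptions, not stated in the card.
NUM_DEVICES = 1      # implied by 2 * 8 = 16
WARMUP_RATIO = 0.05  # reading the 0.05 warmup figure as a ratio


def effective_batch_size(per_device, accum_steps, num_devices):
    """Number of samples that contribute to one optimizer step."""
    return per_device * accum_steps * num_devices


def lr_at(step, total_steps, peak_lr=PEAK_LR, warmup_ratio=WARMUP_RATIO):
    """Learning rate at a given optimizer step.

    Linear warmup from 0 to peak_lr, then cosine decay from peak_lr to 0.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For example, with a hypothetical run of 1000 optimizer steps, `lr_at(50, 1000)` is the peak rate of 2e-05 at the end of warmup, and `lr_at(1000, 1000)` has decayed to zero.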
Key Characteristics
- Base Model: Fine-tuned from Qwen/Qwen3-1.7B-Base.
- Parameter Count: Approximately 2 billion parameters.
- Context Length: Supports a context length of 32768 tokens.
- Performance: Final validation loss of 1.2204 after fine-tuning.
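A validation loss is easier to interpret as a perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model training but is not stated in the card.

```python
import math

VALIDATION_LOSS = 1.2204  # value reported above


def perplexity(mean_cross_entropy_nats):
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy_nats)


print(round(perplexity(VALIDATION_LOSS), 2))  # prints 3.39
```

Under that assumption, the model is on average about as uncertain as a uniform choice among roughly 3.4 tokens on the validation set. The figure is only comparable across models that share the same tokenizer and validation data.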
Potential Use Cases
Given its base architecture and fine-tuning, this model is likely suitable for:
- General Arabic text generation.
- Arabic language understanding tasks.
- Applications where a smaller, efficient Arabic LLM is preferred.
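A minimal text generation sketch follows, assuming the checkpoint loads through the standard Hugging Face `transformers` causal language model classes like its Qwen3 base; this has not been verified against the repository. The helper names `fit_prompt` and `generate_arabic` are hypothetical, and the 32768-token limit is the context length listed above.

```python
MODEL_ID = "raalr/qwen3-1.7b-arabic-standard-kd-500k-run1"
CONTEXT_LENGTH = 32768  # context length listed in the model card


def fit_prompt(token_ids, max_new_tokens, context_length=CONTEXT_LENGTH):
    """Keep the most recent prompt tokens so prompt + output fit the context."""
    if max_new_tokens >= context_length:
        raise ValueError("max_new_tokens must be smaller than the context length")
    budget = context_length - max_new_tokens
    return token_ids[-budget:]


def generate_arabic(prompt, max_new_tokens=64):
    """Generate a continuation of an Arabic prompt (hypothetical helper).

    The heavy dependencies are imported here so that the helpers above
    can be used without them installed.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    ids = fit_prompt(tokenizer(prompt)["input_ids"], max_new_tokens)
    input_ids = torch.tensor([ids])
    output = model.generate(
        input_ids=input_ids,
        attention_mask=torch.ones_like(input_ids),
        max_new_tokens=max_new_tokens,
    )
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][len(ids):], skip_special_tokens=True)


# Example call (downloads the model weights on first use):
# print(generate_arabic("اكتب فقرة قصيرة عن مدينة القاهرة."))
```

Because the model is fine-tuned from a base checkpoint rather than an instruction-tuned one, plain continuation prompts like the one above are probably a safer starting point than chat-style templates.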