raalr/qwen3-1.7b-arabic-standard-kd-500k-run1

Text Generation | Concurrency Cost: 1 | Model Size: 2B | Quant: BF16 | Ctx Length: 32k | Published: Mar 30, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm

The raalr/qwen3-1.7b-arabic-standard-kd-500k-run1 model is a language model of roughly 2 billion parameters, fine-tuned from Qwen/Qwen3-1.7B-Base for Arabic language tasks. It reports a final validation loss of 1.2204. It targets general Arabic text generation and understanding, which makes it suitable for applications that need a compact Arabic LLM.


Overview

This model, raalr/qwen3-1.7b-arabic-standard-kd-500k-run1, is a fine-tuned variant of Qwen/Qwen3-1.7B-Base. It has approximately 2 billion parameters and reached a final validation loss of 1.2204 after fine-tuning.

Training Details

The model was trained with a learning rate of 2e-05 over 3 epochs, using a cosine learning rate scheduler with a warmup setting of 0.05 (most likely a warmup ratio, that is, 5% of the training steps, since a fractional step count is not meaningful). The total batch size was 16 (2 per device with 8 gradient accumulation steps). Validation loss decreased consistently during training and stabilized around 1.2204 by the end of the 3rd epoch.
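To make the batch-size arithmetic and the schedule shape concrete, here is a minimal sketch in plain Python. It assumes the 0.05 value is a warmup ratio, that warmup is linear from zero, and that training ran on a single device. The `total_steps` argument is a hypothetical placeholder, because the source does not report the number of optimizer steps.

```python
import math

# Values reported in the training details.
PEAK_LR = 2e-5
PER_DEVICE_BATCH = 2
GRAD_ACCUM_STEPS = 8

# Assumption: the reported "0.05 warmup steps" is a warmup ratio.
WARMUP_RATIO = 0.05


def effective_batch_size(num_devices: int = 1) -> int:
    """Samples consumed per optimizer step (single device assumed by default)."""
    return PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * num_devices


def cosine_lr(step: int, total_steps: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay to zero."""
    warmup_steps = int(WARMUP_RATIO * total_steps)
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


# Hypothetical run length, for illustration only.
total_steps = 1000
for step in (0, 25, 50, 525, 1000):
    print(step, cosine_lr(step, total_steps))
```

Under these assumptions the rate peaks at 2e-05 once warmup ends and falls to half that value midway through the decay phase.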

Key Characteristics

  • Base Model: Fine-tuned from Qwen/Qwen3-1.7B-Base.
  • Parameter Count: Approximately 2 billion parameters.
  • Context Length: Supports a context length of 32768 tokens.
  • Performance: Achieved a final validation loss of 1.2204. No baseline loss or downstream benchmark is reported, so this figure alone does not quantify the gain from fine-tuning.
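
One way to read the validation loss is as a perplexity. The sketch below assumes the reported 1.2204 is a mean per-token cross-entropy in nats. The source does not confirm this, and the "kd" in the model name suggests a distillation objective may have been involved, in which case this conversion would not apply directly.

```python
import math

# Reported final validation loss.
VAL_LOSS = 1.2204


def perplexity(mean_nll_nats: float) -> float:
    """Perplexity from a mean per-token negative log-likelihood in nats (assumed unit)."""
    return math.exp(mean_nll_nats)


def bits_per_token(mean_nll_nats: float) -> float:
    """The same loss expressed in bits per token."""
    return mean_nll_nats / math.log(2)


print(f"perplexity ~ {perplexity(VAL_LOSS):.2f}")       # about 3.39
print(f"bits/token ~ {bits_per_token(VAL_LOSS):.2f}")   # about 1.76
```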

Potential Use Cases

Given its base architecture and fine-tuning, this model is likely suitable for:

  • General Arabic text generation.
  • Arabic language understanding tasks.
  • Applications where a smaller, efficient Arabic LLM is preferred.
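
For text generation, a minimal loading sketch with the Hugging Face `transformers` library might look like the following. It assumes the repository ships standard causal-LM weights and a tokenizer, that a `transformers` release with Qwen3 support is installed along with `torch`, and that plain continuation prompts are appropriate because the model derives from a base (non-chat) checkpoint. The helper names and the example prompt are illustrative and do not come from the model card.

```python
MODEL_ID = "raalr/qwen3-1.7b-arabic-standard-kd-500k-run1"
MAX_CONTEXT = 32768  # context length stated on the model card


def max_prompt_tokens(max_new_tokens: int) -> int:
    """Prompt budget that leaves room for the requested continuation."""
    return MAX_CONTEXT - max_new_tokens


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Greedy continuation of `prompt`; downloads the weights on first use."""
    # Imported lazily so the module can be loaded without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # Cast to BF16 to match the quantization listed on the page.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID).to(torch.bfloat16)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    if inputs["input_ids"].shape[1] > max_prompt_tokens(max_new_tokens):
        raise ValueError("prompt is too long for the 32768-token context")

    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=max_new_tokens, do_sample=False
        )
    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Example call (requires network access and several GB of memory):
# print(generate("عاصمة مصر هي"))
```

Greedy decoding is used here only to keep the example deterministic. Sampling parameters would need tuning for real applications.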