raalr/qwen3-1.7b-arabic-standard-kd
Text generation · Concurrency cost: 1 · Model size: 2B · Quant: BF16 · Context length: 32k · Published: Mar 29, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

The raalr/qwen3-1.7b-arabic-standard-kd model is a 2-billion-parameter language model fine-tuned from Qwen/Qwen3-1.7B-Base. It is adapted specifically for Arabic-language tasks using knowledge distillation, and is intended for applications that need a compact yet capable model for standard Arabic text processing. It supports a context length of 32,768 tokens.


Model Overview

raalr/qwen3-1.7b-arabic-standard-kd is a 2-billion-parameter language model derived from the Qwen3-1.7B-Base architecture. It has been fine-tuned for standard Arabic text, with knowledge distillation (the "kd" in the name) likely used to preserve performance in a small footprint. The specific dataset used for fine-tuning is not detailed in the available information.
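A minimal inference sketch with the transformers library is shown below. It assumes the model is available on the Hugging Face Hub under this ID; the prompt and generation settings are illustrative, not taken from the model card.

```python
# Minimal text-generation sketch for raalr/qwen3-1.7b-arabic-standard-kd.
# Assumes the model ID resolves on the Hugging Face Hub; settings are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "raalr/qwen3-1.7b-arabic-standard-kd"

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Load the model (BF16 weights, per the card) and generate a completion."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    # Example Arabic prompt: "Write a sentence in Modern Standard Arabic."
    print(generate("اكتب جملة باللغة العربية الفصحى."))
```

The actual download and generation run only under the `__main__` guard, so the helper can be inspected or imported without fetching the weights.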

Training Details

The model was trained for 3 epochs with a learning rate of 2e-05 and an effective batch size of 16 (a per-device train_batch_size of 2 with gradient_accumulation_steps of 8). The optimizer was ADAMW_TORCH with standard betas and epsilon, paired with a cosine learning-rate scheduler and a warmup ratio of 0.05. During training, the validation loss decreased from an initial 2.6974 to a final reported 2.0547.
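The batch-size arithmetic and the learning-rate schedule above can be sketched as follows; the helper functions are hypothetical, written only to mirror the reported hyperparameters (peak LR 2e-05, warmup ratio 0.05, cosine decay).

```python
import math

# Reported hyperparameters from the training details above.
PEAK_LR = 2e-05
TRAIN_BATCH_SIZE = 2   # per-device batch size
GRAD_ACCUM_STEPS = 8   # gradient accumulation steps
WARMUP_RATIO = 0.05    # fraction of training spent warming up

def effective_batch_size(per_device: int, accum: int) -> int:
    """Batch size seen per optimizer step: per-device batch x accumulation."""
    return per_device * accum

def cosine_lr(step: int, total_steps: int) -> float:
    """Linear warmup over the first WARMUP_RATIO of training,
    then cosine decay from PEAK_LR toward zero."""
    warmup_steps = max(1, int(total_steps * WARMUP_RATIO))
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * PEAK_LR * (1 + math.cos(math.pi * progress))
```

With these values, `effective_batch_size(2, 8)` gives the reported total batch size of 16, and the schedule peaks at 2e-05 once warmup ends.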

Framework Versions

The training process utilized:

  • Transformers 5.4.0
  • PyTorch 2.10.0+cu128
  • Datasets 4.8.4
  • Tokenizers 0.22.2

Intended Uses & Limitations

Specific intended uses and limitations are not explicitly detailed. Given its base model and Arabic-focused fine-tuning, it is likely suitable for a range of Arabic natural language processing tasks. Users should be aware that the current model card provides no performance benchmarks and no information about the model's specific capabilities, potential biases, or limitations.