ferrazzipietro/unsup-Qwen3-1.7B-datav3
ferrazzipietro/unsup-Qwen3-1.7B-datav3 is a 1.7 billion parameter language model fine-tuned by ferrazzipietro, based on the Qwen3-1.7B architecture. The model was trained for one epoch with a context length of 32768 tokens, reaching a final validation loss of 0.2568. Its specific capabilities and intended uses are not detailed, as it was fine-tuned on an undisclosed dataset.
Model Overview
ferrazzipietro/unsup-Qwen3-1.7B-datav3 is a 1.7 billion parameter language model derived from the Qwen3-1.7B architecture. It was fine-tuned by ferrazzipietro for a single epoch with a context length of 32768 tokens, though the dataset used for this process remains undisclosed.
Training Details
The model was fine-tuned with the following hyperparameters:
- Learning Rate: 0.0003
- Batch Sizes: `train_batch_size` of 128, `eval_batch_size` of 16, and a `total_train_batch_size` of 512 (with 4 gradient accumulation steps)
- Optimizer: ADAMW_TORCH with default betas and epsilon
- Scheduler: Cosine learning rate scheduler with a 0.1 warmup ratio.
- Epochs: 1
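These settings are internally consistent: the effective batch size of 512 is the per-device train batch size of 128 multiplied by 4 gradient accumulation steps. As a rough sketch (not the actual training code), a cosine scheduler with a 0.1 warmup ratio ramps the learning rate linearly up to the peak of 3e-4 over the first 10% of steps, then decays it along a cosine curve:

```python
import math

def lr_at_step(step, total_steps, peak_lr=3e-4, warmup_ratio=0.1):
    """Cosine learning-rate schedule with linear warmup.

    Illustrative sketch mirroring the card's hyperparameters
    (peak_lr=3e-4, warmup_ratio=0.1); not the exact training code.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 up to the peak learning rate
        return peak_lr * step / warmup_steps
    # Cosine decay from peak_lr down to 0 over the remaining steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Example with a hypothetical 16000-step run: warmup ends at step 1600
print(lr_at_step(800, 16000))    # mid-warmup: half of the peak rate
print(lr_at_step(1600, 16000))   # peak learning rate (3e-4)
print(lr_at_step(16000, 16000))  # end of training: decayed to ~0
```

The `total_steps` value of 16000 here is only an illustration based on the step counts quoted in the loss figures below; the card does not state the true total.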
During training, the training loss decreased from 4.2641 at 1000 steps to 1.8969 at 16000 steps, while the validation loss improved from 0.3924 to a final value of 0.2568.
Limitations
As noted in the original model card, more information is needed regarding the model's specific description, intended uses, and limitations. The dataset used for fine-tuning is also not specified, which limits understanding of its specialized capabilities.