yakazimir/simpo-exps_qwen05b
yakazimir/simpo-exps_qwen05b is a 0.6-billion-parameter language model fine-tuned from trl-lib/qwen1.5-0.5b-sft. It was trained with a learning rate of 8e-08 and a cosine learning rate scheduler over 60 steps. Its primary differentiator and specific use cases are not detailed in the available information, but it achieves a rewards accuracy of 0.5230 on its evaluation set.
Overview
The yakazimir/simpo-exps_qwen05b model is a fine-tuned variant of the trl-lib/qwen1.5-0.5b-sft base model, with approximately 0.6 billion parameters. The dataset used for fine-tuning is not specified. Training used a learning rate of 8e-08, a batch size of 1, and 16 gradient accumulation steps, for a total of 60 training steps.
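Assuming the checkpoint is published on the Hugging Face Hub under the ID above and follows the standard causal-LM interface of its Qwen1.5 base, a minimal loading-and-generation sketch with transformers might look like this (the prompt is purely illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model ID as listed in this card; assumes the repo is public on the Hub.
model_id = "yakazimir/simpo-exps_qwen05b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Generate a short continuation of an illustrative prompt.
inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```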
Key Capabilities
- Fine-tuned Base Model: Built upon the trl-lib/qwen1.5-0.5b-sft architecture.
- Evaluation Metrics: Achieved a loss of 0.7797 and a rewards accuracy of 0.5230 on its evaluation set, with a rewards margin of 0.0860.
- Training Configuration: Utilized an Adam optimizer with specific beta and epsilon values, and a cosine learning rate scheduler with a 0.1 warmup ratio (see the configuration sketch after this list).
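For reference, the reported hyperparameters can be expressed as a transformers TrainingArguments sketch. This is an illustration rather than the authors' training script: the Adam beta and epsilon values are not restated in this card, so the transformers defaults (0.9, 0.999, 1e-08) stand in for them, and output_dir is a hypothetical path.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="simpo-exps_qwen05b",   # hypothetical output path
    learning_rate=8e-08,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,    # effective batch size of 16
    max_steps=60,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    # Adam betas/epsilon left at library defaults; not stated in this card.
)
```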
Good for
- Research into Fine-tuning: Potentially useful for researchers studying the effects of specific fine-tuning parameters on Qwen1.5-0.5B models, given the detailed training hyperparameters.
- Baseline Comparisons: Can serve as a baseline for comparing performance against other fine-tuned models in the 0.6B parameter class, particularly when evaluating reward-based metrics (sketched below).
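As context for the reward-based metrics cited above, the sketch below shows how rewards accuracy and rewards margin are conventionally computed in TRL-style preference training: accuracy is the fraction of preference pairs where the chosen response outscores the rejected one, and margin is their mean difference. The tensor values are made up for demonstration; this is not the model's actual evaluation code.

```python
import torch

def reward_metrics(chosen_rewards: torch.Tensor, rejected_rewards: torch.Tensor):
    # Fraction of pairs where the chosen response scores higher.
    accuracy = (chosen_rewards > rejected_rewards).float().mean().item()
    # Mean gap between chosen and rejected scores.
    margin = (chosen_rewards - rejected_rewards).mean().item()
    return accuracy, margin

# Hypothetical per-pair reward scores, for demonstration only.
chosen = torch.tensor([0.12, -0.05, 0.30, 0.01])
rejected = torch.tensor([0.02, 0.04, 0.10, -0.03])
acc, margin = reward_metrics(chosen, rejected)
print(f"rewards/accuracies: {acc:.4f}, rewards/margins: {margin:.4f}")
```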
Further details regarding its intended uses, limitations, and the specific training and evaluation data are not provided in the current model description.