QpiEImitation/opd_gsm8k_S-Qwen2.5-3B-Instruct_T-Qwen2-7B-Instruct

Text Generation · Concurrency Cost: 1 · Model Size: 3.1B · Quant: BF16 · Ctx Length: 32k · Published: Apr 20, 2026 · Architecture: Transformer

QpiEImitation/opd_gsm8k_S-Qwen2.5-3B-Instruct_T-Qwen2-7B-Instruct is a 3.1 billion parameter instruction-tuned causal language model, fine-tuned from Qwen/Qwen2.5-3B-Instruct with Qwen2-7B-Instruct as the teacher (per the S-/T- tags in the name) on the GSM8K dataset. It was trained with GKD (Generalized Knowledge Distillation), an on-policy distillation method in which the student is trained on its own generated sequences against the teacher's token-level distributions, i.e. it learns from its self-generated mistakes. The distillation targets stronger math word-problem performance while keeping the model suitable for general instruction-following tasks.
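A minimal inference sketch with the Transformers library is shown below. It assumes the checkpoint loads as a standard Qwen2.5-style chat model; the GSM8K-style question is only an illustration.

```python
# Minimal inference sketch (assumes the checkpoint loads like any
# Qwen2.5-Instruct model via transformers; not an official example).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "QpiEImitation/opd_gsm8k_S-Qwen2.5-3B-Instruct_T-Qwen2-7B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card lists BF16 weights
    device_map="auto",
)

# GSM8K-style math word problem, matching the distillation data.
messages = [
    {"role": "user", "content": "Natalia sold clips to 48 of her friends in April, "
     "and then she sold half as many clips in May. How many clips did she sell in total?"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```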


Overview

This model, opd_gsm8k_S-Qwen2.5-3B-Instruct_T-Qwen2-7B-Instruct, is a 3.1 billion parameter instruction-tuned language model. It is a fine-tuned version of the base model Qwen/Qwen2.5-3B-Instruct, distilled against a Qwen2-7B-Instruct teacher on the GSM8K math word-problem dataset.

Key Capabilities

  • Instruction Following: The model has been fine-tuned to follow instructions effectively, building on the capabilities of its Qwen2.5-3B-Instruct base model.
  • GKD Training Method: It was trained with GKD (Generalized Knowledge Distillation), introduced in the paper "On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes". Rather than distilling on a fixed dataset, the student is trained on sequences it generates itself, matching the teacher's token-level distributions on those sequences; this closes the gap between the data seen during training and during inference.
  • TRL Framework: Training used the TRL (Transformer Reinforcement Learning) library, whose GKDTrainer implements this method; a hedged configuration sketch follows this list.
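For reference, here is a minimal sketch of how such a run might be set up with TRL's GKDTrainer. The actual training recipe for this checkpoint is not published, so the dataset mapping and every hyperparameter below are illustrative assumptions.

```python
# Hypothetical GKD training sketch using TRL's GKDTrainer; all values
# below (lmbda, beta, temperature, batch size, dataset mapping) are
# assumptions, not the recipe used for this checkpoint.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import GKDConfig, GKDTrainer

student_id = "Qwen/Qwen2.5-3B-Instruct"  # student (per the S- tag in the name)
teacher_id = "Qwen/Qwen2-7B-Instruct"    # teacher (per the T- tag in the name)

tokenizer = AutoTokenizer.from_pretrained(student_id)
student = AutoModelForCausalLM.from_pretrained(student_id)
teacher = AutoModelForCausalLM.from_pretrained(teacher_id)

# GKDTrainer expects conversational data, i.e. a "messages" column of chat turns.
def to_messages(example):
    return {"messages": [
        {"role": "user", "content": example["question"]},
        {"role": "assistant", "content": example["answer"]},
    ]}

train_dataset = load_dataset("openai/gsm8k", "main", split="train").map(
    to_messages, remove_columns=["question", "answer"]
)

args = GKDConfig(
    output_dir="opd_gsm8k",
    lmbda=1.0,        # fraction of on-policy steps (student-generated sequences)
    beta=0.5,         # generalized JSD interpolation between student and teacher
    temperature=0.9,  # sampling temperature for the student's generations
    max_new_tokens=256,
    per_device_train_batch_size=4,
)

trainer = GKDTrainer(
    model=student,
    teacher_model=teacher,
    args=args,
    processing_class=tokenizer,
    train_dataset=train_dataset,
)
trainer.train()
```

Setting lmbda=1.0 would make the run fully on-policy, in line with the "opd" (on-policy distillation) tag in the model name; lower values mix in supervised distillation on the fixed dataset.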

Good For

  • Math Word Problems and Instruction Following: Distilled on GSM8K, the model is geared toward grade-school math reasoning while retaining the base model's ability to interpret and respond to general user instructions.
  • Research in Distillation: Useful for researchers studying on-policy distillation methods and their effect on student model performance.
  • Efficient Deployment: At 3.1B parameters in BF16, it balances quality against computational cost, fitting scenarios where a 7B-class model such as its teacher would be too resource-intensive.