Model Overview
BRlkl/distill-sft-qwen3-4b-full is a 4-billion-parameter language model derived from the unsloth/Qwen3-4B-Instruct-2507 base model. It was fine-tuned with Supervised Fine-Tuning (SFT) using the TRL library to improve its instruction-following capabilities.
Key Capabilities
- Instruction Following: Optimized through SFT to better understand and respond to user instructions.
- Text Generation: Capable of generating coherent and contextually relevant text based on prompts.
- Context Length: Supports a 32,768-token context window, allowing it to process long inputs and maintain extended conversational history.
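As an illustrative sketch (not an official usage snippet from this card), the model can be loaded for chat-style generation with Hugging Face transformers; the `generate` helper and its defaults below are assumptions, only the model ID and context size come from this card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model ID from this card; the card states a 32,768-token context window.
MODEL_ID = "BRlkl/distill-sft-qwen3-4b-full"

def build_messages(user_prompt: str) -> list:
    # Single-turn chat message list in the format expected by the
    # tokenizer's chat template.
    return [{"role": "user", "content": user_prompt}]

def generate(user_prompt: str, max_new_tokens: int = 256) -> str:
    # Downloads the weights on first use; a GPU is advisable for a 4B model.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer.apply_chat_template(
        build_messages(user_prompt),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```

For example, `generate("Summarize SFT in two sentences.")` returns the model's reply as a plain string.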
Training Details
The model was trained with Supervised Fine-Tuning (SFT) using the TRL framework: it was shown high-quality instruction-response pairs so that its outputs align with the demonstrated instructions and answers. The training runs were tracked and can be visualized via Weights & Biases.
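To make the instruction-response pairs concrete, the sketch below shows the conversational format that TRL's SFTTrainer accepts (a `messages` list of role/content dicts per example); the helper name and the sample pair are illustrative assumptions, not the card's actual training data:

```python
def to_sft_example(instruction: str, response: str) -> dict:
    # One instruction-response pair in the conversational format that
    # TRL's SFTTrainer consumes: a "messages" list of role/content dicts.
    return {
        "messages": [
            {"role": "user", "content": instruction},
            {"role": "assistant", "content": response},
        ]
    }

# Hypothetical sample; the real training set is not described in this card.
dataset = [
    to_sft_example(
        "What does SFT do?",
        "Supervised Fine-Tuning trains the model on curated "
        "instruction-response pairs so it follows instructions better.",
    )
]
```

During training, SFTTrainer applies the tokenizer's chat template to each `messages` list and optimizes the model to reproduce the assistant turns.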
Use Cases
This model suits natural language processing tasks that benefit from instruction tuning, including question answering, content creation, and conversational AI, particularly where a 4-billion-parameter model with a large context window is desired.