SeongryongJung/qwen2.5-1.5b-ifeval-halfepoch-sft
Text generation · Concurrency cost: 1 · Model size: 1.5B · Quantization: BF16 · Context length: 32k · Published: Apr 16, 2026 · Architecture: Transformer
SeongryongJung/qwen2.5-1.5b-ifeval-halfepoch-sft is a 1.5 billion parameter language model fine-tuned from Qwen2.5-1.5B-Instruct. It was trained on the IFEvalSFTDataset for an effective half-epoch, targeting instruction following, and retains the base model's 32768-token context length.
Overview
This model, SeongryongJung/qwen2.5-1.5b-ifeval-halfepoch-sft, is a fine-tuned variant of Qwen2.5-1.5B-Instruct. It has 1.5 billion parameters and a 32768-token context length, making it suitable for longer inputs and complex, multi-part instructions.
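For reference, here is a minimal inference sketch using Hugging Face transformers. The repository id and BF16 precision are taken from this card; the prompt and generation settings are illustrative assumptions, not part of the model release.

```python
# Minimal inference sketch (assumes transformers and torch are installed and that
# the repository id on this card resolves on the Hugging Face Hub).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SeongryongJung/qwen2.5-1.5b-ifeval-halfepoch-sft"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 precision listed above
    device_map="auto",
)

# Qwen2.5 instruct models use a chat template; this prompt is an illustrative
# instruction-following query with a verifiable format constraint.
messages = [
    {"role": "user", "content": "List three rules for writing a haiku. Answer in exactly three bullet points."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```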
Key Capabilities
- Instruction Following: The model was fine-tuned on the IFEvalSFTDataset for an effective half-epoch, with the aim of improving how accurately it understands and executes instructions.
- Efficient Training: The fine-tuning process covered 4064 training datapoints over a single configured epoch, with a learning rate of 1e-4 and a total batch size of 64, a focused and efficient setup for instruction-based tasks (see the training sketch after this list).
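The hyperparameters above can be expressed as a training configuration. The sketch below uses TRL's SFTTrainer to illustrate a comparable setup; the dataset path, the per-device batch size / gradient accumulation split, and realizing the half-epoch via num_train_epochs=0.5 are all assumptions, not the author's actual training script.

```python
# Illustrative SFT configuration mirroring the card's stated hyperparameters
# (learning rate 1e-4, total batch size 64, effective half-epoch, 4064 datapoints).
# The dataset identifier and the batch-size split are hypothetical placeholders.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("IFEvalSFTDataset", split="train")  # hypothetical path

config = SFTConfig(
    output_dir="qwen2.5-1.5b-ifeval-halfepoch-sft",
    learning_rate=1e-4,
    per_device_train_batch_size=8,   # 8 x 8 accumulation = total batch size 64 (assumed split)
    gradient_accumulation_steps=8,
    num_train_epochs=0.5,            # one way to realize an "effective half-epoch"
    bf16=True,                       # matches the BF16 precision listed above
)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-1.5B-Instruct",  # base model named on this card
    args=config,
    train_dataset=dataset,
)
trainer.train()
```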
Good For
- Instruction-tuned applications: Ideal for scenarios where precise adherence to given instructions is critical.
- Research and experimentation: Useful for developers and researchers exploring the impact of targeted instruction-following fine-tuning on smaller, yet capable, language models.