niklasm222/qwen2.5-3b-inst-grpo-1.75k-gsm8k-sp_struct-rwd1-v4.2
TEXT GENERATIONConcurrency Cost:1Model Size:3.1BQuant:BF16Ctx Length:32kPublished:Apr 12, 2025License:apache-2.0Architecture:Transformer Open Weights Warm
The niklasm222/qwen2.5-3b-inst-grpo-1.75k-gsm8k-sp_struct-rwd1-v4.2 is a Qwen2.5-3B-Instruct model, developed by niklasm222, fine-tuned from unsloth/qwen2.5-3b-instruct-unsloth-bnb-4bit. This model was trained using Unsloth and Huggingface's TRL library, resulting in a 2x faster training process. It is designed for general instruction-following tasks, leveraging its Qwen2.5 architecture for efficient performance.
Loading preview...
Popular Sampler Settings
Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.
temperature
–
top_p
–
top_k
–
frequency_penalty
–
presence_penalty
–
repetition_penalty
–
min_p
–