emrecanacikgoz/Qwen2.5-7B-Instruct-ToolRL-grpo-cold
TEXT GENERATIONConcurrency Cost:1Model Size:7.6BQuant:FP8Ctx Length:32kPublished:Apr 22, 2025License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

The emrecanacikgoz/Qwen2.5-7B-Instruct-ToolRL-grpo-cold model is a 7.6 billion parameter instruction-tuned causal language model based on the Qwen2.5 architecture. It is fine-tuned with ToolRL and grpo-cold methods, suggesting an optimization for tool-use capabilities and improved instruction following. This model is designed for tasks requiring precise instruction adherence and potential integration with external tools.

Loading preview...