beyoru/Qwen3-4B-I-1209 is a 4-billion-parameter, instruction-tuned causal language model based on Qwen3-4B-Instruct-2507. Developed by Beyoru, it is fine-tuned with reinforcement learning, using GRPO (Group Relative Policy Optimization) and multiple reward functions, to improve tool use and function-call generation. It reaches an overall accuracy of 0.7233 on ACEBench, outperforming both its base model and Salesforce/Llama-xLAM-2-8b-fc-r.
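To illustrate what GRPO with multiple reward functions can look like for function-call generation, here is a minimal sketch. It is not the author's training code: the two reward functions (`format_reward`, `name_reward`), their weights, the JSON call format, and the sample completions are all hypothetical, chosen only to show how several rewards might be combined and then normalized within a group of sampled completions.

```python
import json
from statistics import mean, pstdev


def _parse_call(completion: str):
    """Return the completion as a dict if it is a JSON object, else None."""
    try:
        call = json.loads(completion)
    except json.JSONDecodeError:
        return None
    return call if isinstance(call, dict) else None


def format_reward(completion: str) -> float:
    """Hypothetical reward: 1.0 if the call has 'name' and 'arguments' keys."""
    call = _parse_call(completion)
    return 1.0 if call is not None and {"name", "arguments"} <= call.keys() else 0.0


def name_reward(completion: str, expected_name: str) -> float:
    """Hypothetical reward: 1.0 if the call names the expected function."""
    call = _parse_call(completion)
    return 1.0 if call is not None and call.get("name") == expected_name else 0.0


def total_reward(completion: str, expected_name: str, weights=(0.5, 0.5)) -> float:
    """Weighted sum of the individual rewards (the weights are an assumption)."""
    return (weights[0] * format_reward(completion)
            + weights[1] * name_reward(completion, expected_name))


def group_relative_advantages(rewards, eps=1e-6):
    """Score each reward relative to its group: (r - mean) / (std + eps).

    Population standard deviation is used here; implementations differ.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# A group of completions sampled for one prompt (made-up examples).
completions = [
    '{"name": "get_weather", "arguments": {"city": "Paris"}}',
    '{"name": "get_time", "arguments": {}}',
    "not json",
]
rewards = [total_reward(c, expected_name="get_weather") for c in completions]
advantages = group_relative_advantages(rewards)
```

In this sketch, the completion that is well formed and names the right function receives a positive advantage, the unparseable one receives a negative advantage, and the policy update would push probability toward the former. Because advantages are computed relative to the group, no separate value model is needed.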