RefinedNeuro/RefinedToolCallV5-3b
RefinedNeuro/RefinedToolCallV5-3b is a 3.1 billion parameter model built on WeiboAI/VibeThinker-3B, specifically optimized for multi-turn agentic tool calling and mathematical reasoning. It demonstrates significantly improved stateful, multi-step tool use and maintains strong reasoning capabilities, achieving 0.933 on AIME-2024 pass@8. This 32K context length model is designed for local, offline agentic tool-use prototypes and multi-step function-calling assistants.
Loading preview...
RefinedToolCall-V5-3B Overview
RefinedNeuro/RefinedToolCallV5-3b is a 3.1 billion parameter model, developed by RefinedNeuro, that excels in multi-turn agentic tool calling and mathematical reasoning. Unlike many smaller models that struggle with complex, multi-step interactions, this model is specifically engineered to maintain coherence and effectiveness across several turns, showing a ~3.7x improvement in multi-turn stateful tool-use on the Berkeley Function-Calling Leaderboard (multi_turn).
Key Capabilities
- Enhanced Multi-turn Agentic Behavior: Achieves 0.220 average / 0.298 pass@3 on BFCL
multi_turn, demonstrating robust multi-step tool-use. - Sharp Single-turn Function Calling: Scores 0.707 on BFCL single-turn (held-out).
- Tool Error Recovery: Boasts a 0.896 recovery rate, allowing it to diagnose and recover from tool failures.
- Intact Reasoning: Maintains strong mathematical reasoning with 0.933 on AIME-2024 pass@8, indicating no degradation from tool training.
- Compact & Local: A 3B parameter model (2.5 GB Q6_K) designed to run efficiently on laptops via Ollama, without requiring a dedicated GPU.
- Apache-2.0 Licensed: Freely available for use, shipping, and fine-tuning.
How it Achieved This
The model's capabilities were developed through five disciplined fine-tuning rounds, including a breakthrough in on-policy self-improvement where the model learned from its own successful multi-turn solutions. This process ensured that reasoning and recovery capabilities were never regressed.
Good For
- Local/offline agentic tool-use prototypes
- Multi-step function-calling assistants
- Math & STEM reasoning tasks
- Learning about the construction of small agentic models