seanpoyner/smolcode-coder-1.5b-tools
seanpoyner/smolcode-coder-1.5b-tools is a 1.5 billion parameter LoRA fine-tune of Qwen2.5-Coder-1.5B-Instruct, specifically designed to emit native function calls. This model excels at driving agentic write-run-fix-verify loops by enabling proper tool-use parsing in runtimes like Ollama. It features a 32768 token context length and achieves 100% native emission, making it suitable for small, efficient coding agents.
Loading preview...
Model Overview
smolcode-coder-1.5b-tools is a 1.5 billion parameter LoRA fine-tune of the Qwen2.5-Coder-1.5B-Instruct model, developed by seanpoyner. Its primary innovation is enabling the model to emit native <tool_call> function calls (token ID 151657), which is crucial for agentic workflows where runtimes like Ollama or llama.cpp parse these into OpenAI-style tool_calls.
Key Capabilities & Differentiators
- Native Tool Call Emission: Achieves a 100% native tool-call rate on held-out prompts, a significant improvement over the base model which typically describes tool calls as plain text JSON.
- Agentic Loop Support: Specifically designed to drive agentic write → run → fix → verify loops, as demonstrated in the smolcode project.
- Efficient Agent Performance: Despite its small size (1.5B parameters), it performs comparably to larger 3B models in agentic benchmarks, solving 7 out of 10 tasks entirely on its own in the smolcode pass@1 evaluation.
- Specialized Training: Fine-tuned using
NousResearch/hermes-function-calling-v1and synthetic smolcode tool-use trajectories, with full training ofembed_tokensandlm_headto correctly output the<tool_call>special token.
Usage Notes
- Serving: Recommended to serve via the provided GGUF (
smolcode-1.5b-q4_k_m.gguf) for compatibility with Ollama, as directsafetensorsimport can cause issues. repeat_penalty: It is critical to setrepeat_penalty(orrepetition_penalty) to1.0to ensure the model correctly emits the<tool_call>token, as the system prompt contains this token.