seanpoyner/smolcode-coder-1.5b-tools

TEXT GENERATIONConcurrency Cost:1Model Size:1.5BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Jun 15, 2026License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

seanpoyner/smolcode-coder-1.5b-tools is a 1.5 billion parameter LoRA fine-tune of Qwen2.5-Coder-1.5B-Instruct, specifically designed to emit native function calls. This model excels at driving agentic write-run-fix-verify loops by enabling proper tool-use parsing in runtimes like Ollama. It features a 32768 token context length and achieves 100% native emission, making it suitable for small, efficient coding agents.

Loading preview...

Model Overview

smolcode-coder-1.5b-tools is a 1.5 billion parameter LoRA fine-tune of the Qwen2.5-Coder-1.5B-Instruct model, developed by seanpoyner. Its primary innovation is enabling the model to emit native <tool_call> function calls (token ID 151657), which is crucial for agentic workflows where runtimes like Ollama or llama.cpp parse these into OpenAI-style tool_calls.

Key Capabilities & Differentiators

  • Native Tool Call Emission: Achieves a 100% native tool-call rate on held-out prompts, a significant improvement over the base model which typically describes tool calls as plain text JSON.
  • Agentic Loop Support: Specifically designed to drive agentic write → run → fix → verify loops, as demonstrated in the smolcode project.
  • Efficient Agent Performance: Despite its small size (1.5B parameters), it performs comparably to larger 3B models in agentic benchmarks, solving 7 out of 10 tasks entirely on its own in the smolcode pass@1 evaluation.
  • Specialized Training: Fine-tuned using NousResearch/hermes-function-calling-v1 and synthetic smolcode tool-use trajectories, with full training of embed_tokens and lm_head to correctly output the <tool_call> special token.

Usage Notes

  • Serving: Recommended to serve via the provided GGUF (smolcode-1.5b-q4_k_m.gguf) for compatibility with Ollama, as direct safetensors import can cause issues.
  • repeat_penalty: It is critical to set repeat_penalty (or repetition_penalty) to 1.0 to ensure the model correctly emits the <tool_call> token, as the system prompt contains this token.