schneewolflabs/A2
schneewolflabs/A2 is a 12 billion parameter Mistral-Nemo-class language model developed by Schneewolf Labs, building upon the A-series lineage. It integrates robust tool and function calling capabilities, including parallel calls and abstention, while preserving the strong reasoning abilities and distinct identity established in previous A-series models. Optimized for applications requiring reliable function execution and logical processing, A2 is trained with ORPO on a blend of tool-calling and identity-rehearsal datasets.
schneewolflabs/A2: Function Calling with Retained Reasoning
A2 is a 12 billion parameter model from Schneewolf Labs, part of the A-series, specifically designed to integrate advanced tool and function calling while maintaining the strong reasoning and unique identity of its predecessors. It builds on the A1.1 model, which itself incorporated Claude-distilled reasoning and a distinct Schneewolf Labs/Luna persona.
Key Capabilities
- Function Calling: Implements Qwen3-convention tool calling, supporting both single and parallel calls, and correctly abstains when no suitable tool is available.
- Reasoning: Retains the <think>…</think> step-by-step reasoning style, engaging in brief reasoning before complex actions and skipping it for trivial ones.
- Identity: By default, identifies as a "language model created by Schneewolf Labs." A distinct "Luna" persona with a terse voice can be activated via a specific system prompt.
- Efficient Tokenization: Tool and reasoning tokens reuse reserved tokenizer slots, avoiding vocabulary resizing.
- Extended Context: Features a context length of 128k tokens.
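To make the function-calling and reasoning bullets concrete, here is a minimal sketch of how a client might split a raw completion into reasoning, tool calls, and answer text. It assumes the Qwen3 convention of wrapping each call in `<tool_call>…</tool_call>` tags around a JSON object with `name` and `arguments` keys; check the model's chat template before relying on this. The `parse_response` helper and the `get_weather` tool are hypothetical names used only for illustration.

```python
import json
import re

# Assumed tag format (Qwen3-style); verify against the model's chat template.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def parse_response(text):
    """Split a raw completion into reasoning, tool calls, and answer text."""
    think_match = THINK_RE.search(text)
    reasoning = think_match.group(1).strip() if think_match else ""

    calls = []
    for payload in TOOL_CALL_RE.findall(text):
        try:
            call = json.loads(payload.strip())
        except json.JSONDecodeError:
            continue  # skip malformed payloads rather than guessing at a repair
        if isinstance(call, dict) and "name" in call:
            calls.append({"name": call["name"],
                          "arguments": call.get("arguments", {})})

    # Whatever remains after removing both tag types is the user-facing answer.
    answer = TOOL_CALL_RE.sub("", THINK_RE.sub("", text)).strip()
    return {"reasoning": reasoning, "tool_calls": calls, "answer": answer}


# Hypothetical completion containing two parallel calls to an invented tool.
sample = (
    "<think>Two cities, so two lookups.</think>\n"
    '<tool_call>\n{"name": "get_weather", "arguments": {"city": "Berlin"}}\n</tool_call>\n'
    '<tool_call>\n{"name": "get_weather", "arguments": {"city": "Oslo"}}\n</tool_call>'
)
parsed = parse_response(sample)

# Abstention shows up as an empty tool_calls list with a plain-text answer.
abstained = parse_response("I don't have a tool for that.")
```

Parallel calls appear as several entries in `tool_calls`, while an abstaining reply yields an empty list, so a caller can branch on the list length without inspecting the text.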
Training and Evaluation
A2 was fine-tuned from A1.1 using ORPO (Odds Ratio Preference Optimization). The training data was built on a backbone of tool-calling datasets such as NousResearch/hermes-function-calling-v1 and glaiveai/glaive-function-calling-v2, with synthesized rejected examples added to make abstention robust. Identity and voice rehearsal data (schneewolflabs/i-DPO) made up about 23% of the training data. Behavioral checks confirm correct calls on unseen tools, proper abstention, and consistent identity.
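For readers unfamiliar with ORPO, the sketch below shows the general shape of its per-example objective: a standard supervised loss on the chosen response plus a weighted odds-ratio term that pushes the chosen response above the rejected one. This illustrates the published ORPO formulation, not Schneewolf Labs' training code. The function names and the weight `lam=0.1` are assumptions, since the source does not state its hyperparameters.

```python
import math


def log_odds(avg_logp):
    """Return log(p / (1 - p)) for p = exp(avg_logp), with avg_logp < 0."""
    p = math.exp(avg_logp)
    return avg_logp - math.log1p(-p)


def orpo_loss(chosen_avg_logp, rejected_avg_logp, lam=0.1):
    """SFT loss on the chosen response plus a weighted odds-ratio penalty.

    Inputs are length-normalised (per-token average) log-probabilities.
    lam is an assumed weight; the source does not give the value used.
    """
    sft = -chosen_avg_logp  # negative log-likelihood of the chosen response
    log_odds_ratio = log_odds(chosen_avg_logp) - log_odds(rejected_avg_logp)
    # -log(sigmoid(x)) written as log(1 + exp(-x))
    odds_ratio_term = math.log1p(math.exp(-log_odds_ratio))
    return sft + lam * odds_ratio_term


# The loss is lower when the model already prefers the chosen response.
preferred = orpo_loss(chosen_avg_logp=-0.2, rejected_avg_logp=-1.5)
reversed_pair = orpo_loss(chosen_avg_logp=-1.5, rejected_avg_logp=-0.2)
```

In the abstention setting described above, the chosen response would be a plain reply that declines to call a tool and the rejected response a synthesized, unwarranted call, so the odds-ratio term penalises the model for favouring the spurious call.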
Limitations
A2's tool-calling data is primarily single-turn, so multi-turn tool/result chains are less robust. Its synthetic preference negatives target structural correctness, which leaves robustness to subtle real-world tool errors unmeasured. As a 12B model, it can still struggle with complex arithmetic or trick problems despite otherwise strong reasoning.