Model Overview
minpeter/Llama-3.2-1B-chatml-tool-v4 is a 1-billion-parameter language model built on the Llama-3.2 architecture and fine-tuned specifically for tool-use scenarios using the ChatML format. It features a substantial context window of 32,768 tokens, enabling it to handle complex multi-turn interactions and detailed function calls.
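ChatML wraps each conversation turn in `<|im_start|>role … <|im_end|>` markers. A minimal sketch of rendering a tool-use conversation in that markup is shown below; note that the system-prompt wording and the `get_weather` tool schema are illustrative assumptions, not the model's actual trained template (in practice you would rely on the tokenizer's built-in chat template).

```python
import json

def build_chatml_prompt(messages):
    """Render a list of {role, content} messages in ChatML markup."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    parts.append("<|im_start|>assistant\n")  # cue the model to respond
    return "\n".join(parts)

# Hypothetical tool schema, serialized into the system turn as JSON.
weather_tool = {
    "name": "get_weather",
    "description": "Look up the current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

messages = [
    {"role": "system",
     "content": "You may call tools.\nTools:\n" + json.dumps([weather_tool])},
    {"role": "user", "content": "What's the weather in Seoul?"},
]

prompt = build_chatml_prompt(messages)
print(prompt)
```

The trailing `<|im_start|>assistant\n` leaves the prompt open at the assistant turn, so generation continues as the model's reply.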
Key Capabilities & Performance
This model shows notable performance improvements over the base meta-llama/Llama-3.2-1B-Instruct model on specific tool-use benchmarks, particularly in the 'simple' and 'multiple' task categories. For instance, it scores 0.7725 on 'simple' tasks and 0.765 on 'multiple' tasks, far surpassing the base model's scores of 0.215 and 0.17, respectively. Parallel function calls are not yet supported in current versions; future updates aim to address this.
Good For
- Function Calling: Excels in scenarios requiring the model to identify and execute specific tools or functions based on user prompts.
- Structured Interactions: Ideal for applications where the model needs to generate structured outputs or interact with external systems.
- Efficient Tool-Use: Offers a compact 1B parameter size combined with strong performance in targeted tool-use tasks, making it suitable for resource-constrained environments.
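In a function-calling loop like the ones described above, application code has to extract the tool invocation from the model's free-form reply. A minimal sketch follows, assuming the model emits a JSON object with `name` and `arguments` fields (the exact output schema this model was trained on is an assumption here; check the model's examples for the real format):

```python
import json

def parse_tool_call(text):
    """Extract the first balanced JSON object in the model's reply, if any."""
    start = text.find("{")
    if start == -1:
        return None
    depth = 0
    for i, ch in enumerate(text[start:], start):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:  # matching close brace found
                try:
                    return json.loads(text[start:i + 1])
                except json.JSONDecodeError:
                    return None
    return None  # unbalanced braces: no complete object

reply = 'Let me check. {"name": "get_weather", "arguments": {"city": "Seoul"}}'
call = parse_tool_call(reply)
print(call["name"])       # -> get_weather
print(call["arguments"])  # -> {'city': 'Seoul'}
```

Walking a brace counter rather than using a regex keeps nested objects (like the `arguments` field) intact, and returning `None` on malformed output lets the caller fall back to treating the reply as plain text.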