Overview
minpeter/Llama-3.2-1B-chatml-tool-v2 is a 1-billion-parameter language model built on the Llama-3.2 architecture and fine-tuned for tool-use scenarios. Version 2 addresses a key limitation of v1 by adopting an AlternateTokenizer that properly handles tool-related tokens such as <tools>, <tool_call>, and <tool_response>. In v1, the <tool_call> tag was not consistently generated before the function call; this change significantly improves the model's ability to generate correct function calls.
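Because correct output depends on the <tool_call> tag appearing around each function call, a consumer of the model's text will typically look for that tag first. The following is a minimal sketch of such a parser, not the author's implementation. It assumes each call is wrapped in <tool_call>...</tool_call> and that the payload is a JSON object with "name" and "arguments" keys; the payload format, the `extract_tool_calls` helper, and the `get_weather` tool name are all illustrative assumptions, not taken from the model card.

```python
import json
import re

# Assumption: calls are wrapped in <tool_call>...</tool_call> and the payload
# is a JSON object such as {"name": ..., "arguments": {...}}.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text):
    """Return the JSON payloads of all well-formed <tool_call> blocks."""
    calls = []
    for raw in TOOL_CALL_RE.findall(text):
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip payloads that are not valid JSON
        if isinstance(payload, dict) and "name" in payload:
            calls.append(payload)
    return calls


# Hypothetical model output used only for illustration.
sample = (
    "<tool_call>\n"
    '{"name": "get_weather", "arguments": {"city": "Seoul"}}\n'
    "</tool_call>"
)
print(extract_tool_calls(sample))
```

Text that contains a JSON call without the surrounding tag yields an empty list here, which mirrors the v1 failure mode described above: the call may be present but is not reliably detectable.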
Key Capabilities
- Enhanced Tool-Use: Improved generation of tool-related tokens for more reliable function calling.
- Performance in Specific Tasks: Outperforms the base meta-llama/Llama-3.2-1B-Instruct model in the 'simple' and 'multiple' task categories, scoring 0.72 and 0.695 respectively, compared to 0.215 and 0.17 for the base model.
- Context Length: Supports a context length of 32768 tokens.
Good For
- Tool-Augmented Applications: Ideal for use cases where the model needs to interact with external tools or APIs by generating structured function calls.
- Structured Output Generation: Suitable for tasks requiring precise and correctly formatted outputs, especially when integrating with other systems.
- Research and Development: Provides a foundation for exploring tool-use fine-tuning on larger Llama-3.2 models (3B, 8B) and for confirming whether the observed performance improvements carry over.
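For tool-augmented applications, the generated function call still has to be executed and its result passed back to the model. The sketch below shows one way that step might look, assuming a call has already been parsed into a dict with "name" and "arguments" keys and that results are returned inside <tool_response> tags as JSON. The `TOOLS` registry, the `run_tool_call` helper, and the `get_weather` tool are hypothetical, and the exact response format the model expects is an assumption.

```python
import json


def get_weather(city):
    """Hypothetical tool; returns canned data instead of calling a real API."""
    return {"city": city, "temp_c": 21}


# Hypothetical registry mapping tool names to Python callables.
TOOLS = {"get_weather": get_weather}


def run_tool_call(call):
    """Execute one parsed call and wrap the result in <tool_response> tags."""
    name = call.get("name")
    args = call.get("arguments") or {}
    if name not in TOOLS:
        result = {"error": f"unknown tool: {name}"}
    else:
        try:
            result = TOOLS[name](**args)
        except TypeError as exc:
            # Raised when the arguments do not match the tool's signature.
            result = {"error": str(exc)}
    # Assumption: the model reads tool results as JSON inside these tags.
    return "<tool_response>\n" + json.dumps(result) + "\n</tool_response>"


call = {"name": "get_weather", "arguments": {"city": "Seoul"}}
print(run_tool_call(call))
```

Returning errors as structured JSON rather than raising keeps the conversation going, so the model has a chance to correct a wrong tool name or argument on its next turn.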