minpeter/Llama-3.2-1B-chatml-tool-v2
minpeter/Llama-3.2-1B-chatml-tool-v2 is a 1 billion parameter model based on the Llama-3.2 architecture, fine-tuned for tool-use capabilities. This version utilizes an AlternateTokenizer that correctly handles tool-related tokens, improving performance in generating function calls compared to its predecessor. It demonstrates strong performance in 'simple' and 'multiple' task categories, making it suitable for applications requiring structured tool interactions.
Overview
minpeter/Llama-3.2-1B-chatml-tool-v2 is a 1 billion parameter language model built on the Llama-3.2 architecture and fine-tuned specifically for tool-use scenarios. The v2 release addresses a key limitation of v1 by adopting an AlternateTokenizer that properly handles tool-related tokens such as <tools>, <tool_call>, and <tool_response>. This change significantly improves the model's ability to emit correct function calls; in v1, the <tool_call> tag was not consistently generated before the call itself.
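As an illustration of how the tokens above are typically consumed, the sketch below extracts JSON payloads from <tool_call> spans in generated text. The exact template is an assumption based on the common ChatML/Hermes-style convention these token names suggest; the chat template shipped with the model's tokenizer is authoritative.

```python
import json
import re

def extract_tool_calls(generated_text: str) -> list[dict]:
    """Pull JSON payloads out of <tool_call>...</tool_call> spans.

    Assumes each call is a JSON object with "name" and "arguments"
    keys (a common ChatML-style convention, not confirmed here).
    """
    pattern = re.compile(r"<tool_call>\s*(\{.*?\})\s*</tool_call>", re.DOTALL)
    return [json.loads(payload) for payload in pattern.findall(generated_text)]

# Hypothetical model output containing one function call
sample = (
    "<tool_call>\n"
    '{"name": "get_weather", "arguments": {"city": "Seoul"}}\n'
    "</tool_call>"
)
calls = extract_tool_calls(sample)
```

If no well-formed <tool_call> block is present, the function simply returns an empty list, which makes it easy to distinguish plain-text replies from function calls.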
Key Capabilities
- Enhanced Tool-Use: Improved generation of tool-related tokens for more reliable function calling.
- Performance in Specific Tasks: Outperforms the base meta-llama/Llama-3.2-1B-Instruct model in the 'simple' and 'multiple' task categories, scoring 0.72 and 0.695 versus the base model's 0.215 and 0.17.
- Context Length: Supports a context length of 32768 tokens.
Good For
- Tool-Augmented Applications: Ideal for use cases where the model needs to interact with external tools or APIs by generating structured function calls.
- Structured Output Generation: Suitable for tasks requiring precise and correctly formatted outputs, especially when integrating with other systems.
- Research and Development: Provides a foundation for further exploration into tool-use capabilities in larger Llama-3.2 models (3B, 8B) to confirm the observed performance improvements.
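For the tool-augmented use case above, a minimal dispatch loop might look like the sketch below. The tool registry and the get_weather stub are hypothetical; it assumes a parsed call is a dict with "name" and "arguments" keys and that results are fed back inside a <tool_response> block, per the token convention described earlier.

```python
import json

# Hypothetical tool: a real implementation would call an external API.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

# Registry mapping tool names (as the model emits them) to callables.
TOOLS = {"get_weather": get_weather}

def dispatch(tool_call: dict) -> str:
    """Execute one parsed tool call and wrap the result in a
    <tool_response> block to append to the conversation."""
    fn = TOOLS[tool_call["name"]]
    result = fn(**tool_call["arguments"])
    return f"<tool_response>\n{json.dumps({'result': result})}\n</tool_response>"

response = dispatch({"name": "get_weather", "arguments": {"city": "Seoul"}})
```

The <tool_response> string would then be added as a new turn so the model can produce its final natural-language answer.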