Matter-0.1-Slim-7B-C-DPO Overview
Matter-0.1-Slim-7B-C-DPO is a 7 billion parameter model based on the Mistral 7B architecture, developed by 0-hero. It has undergone continuous full fine-tuning with Direct Preference Optimization (DPO) on the Matter-0.1-Slim-C dataset, a curated collection drawn from over 35 distinct datasets and encompassing more than 6 billion tokens.
Key Capabilities
- Function Calling: The model is explicitly designed to support function calling, allowing it to interact with external tools and APIs. It uses dedicated tokens (<|begin_func|>, <|end_func|>, <|begin_func_response|>, <|end_func_response|>) to delineate function calls and their responses within the conversation.
- Instruction Following: Optimized through DPO, the model excels at understanding and executing complex instructions, particularly in scenarios where tool use is beneficial.
- ChatML Format: It utilizes the ChatML prompt format, ensuring compatibility with common chat-based interfaces and structured conversations.
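The capabilities above can be combined into a single prompt string. The sketch below assembles a ChatML conversation that embeds a function call and its response using the model's delimiter tokens. Only the special tokens come from the model card; the JSON layout of the function payloads and the helper's name are illustrative assumptions.

```python
import json

def build_prompt(system, user, func_call=None, func_response=None):
    """Assemble a ChatML conversation, optionally appending a function
    call and its response using the model's delimiter tokens.
    NOTE: the JSON payload shapes are assumptions, not from the model card."""
    parts = [
        f"<|im_start|>system\n{system}<|im_end|>",
        f"<|im_start|>user\n{user}<|im_end|>",
    ]
    if func_call is not None:
        # The model emits function calls between <|begin_func|> / <|end_func|>.
        parts.append(
            "<|im_start|>assistant\n"
            f"<|begin_func|>{json.dumps(func_call)}<|end_func|><|im_end|>"
        )
    if func_response is not None:
        # Tool output is fed back between the func_response delimiters.
        parts.append(
            "<|im_start|>user\n"
            f"<|begin_func_response|>{json.dumps(func_response)}"
            "<|end_func_response|><|im_end|>"
        )
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_prompt(
    system="You are a helpful assistant with access to a weather tool.",
    user="What's the weather in Paris?",
    func_call={"name": "get_weather", "arguments": {"city": "Paris"}},
    func_response={"temp_c": 18, "conditions": "cloudy"},
)
print(prompt)
```

The trailing open assistant turn leaves room for the model's final, natural-language answer.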
Training Details
The model was trained for approximately 17 hours over 3 epochs on 4x A100 GPUs (80GB each), using the Axolotl framework for full fine-tuning. Training on this large and diverse dataset underpins its performance in instruction following and function calling.
Good For
- Applications requiring tool use and function calling capabilities.
- Building intelligent agents that can interact with external systems.
- Scenarios demanding strong instruction following in a chat-based context.
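For the agent and tool-use scenarios above, the application must recover structured calls from the model's output. The sketch below extracts function-call payloads using the delimiter tokens documented earlier; the JSON argument layout inside the delimiters is an assumption, not something the model card specifies.

```python
import json
import re

# Matches everything between the model's function-call delimiter tokens.
FUNC_RE = re.compile(r"<\|begin_func\|>(.*?)<\|end_func\|>", re.DOTALL)

def extract_func_calls(generated: str):
    """Return every function-call payload found between
    <|begin_func|> / <|end_func|>, parsed as JSON.
    NOTE: assumes the model emits JSON inside the delimiters."""
    return [json.loads(m) for m in FUNC_RE.findall(generated)]

sample = (
    "Sure, let me check that.\n"
    '<|begin_func|>{"name": "get_weather", '
    '"arguments": {"city": "Paris"}}<|end_func|>'
)
calls = extract_func_calls(sample)
print(calls[0]["name"])  # → get_weather
```

An agent loop would dispatch each parsed call to the matching tool, then feed the result back between <|begin_func_response|> and <|end_func_response|> for the next generation step.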