astom-M/matsuo-llm-advanced-phase-f2a
The astom-M/matsuo-llm-advanced-phase-f2a is a 7.6 billion parameter language model fine-tuned from Qwen/Qwen2.5-7B-Instruct, specifically optimized for agent tasks, particularly those involving database interactions. It leverages a unique training data composition including Spider/BIRD, DBBench v4, ALFWorld, and newly introduced multi-turn distilled data generated by Qwen2.5-72B-Instruct-AWQ. This model excels at complex SQL generation and multi-turn database conversations, making it suitable for applications requiring robust database agent capabilities.
Loading preview...
Overview
matsuo-llm-advanced-phase-f2a is a 7.6 billion parameter model, fine-tuned from the Qwen/Qwen2.5-7B-Instruct base model. Its primary focus is on enhancing performance for agent tasks, particularly those involving complex database interactions and multi-turn conversations. The model's training incorporates a unique blend of datasets, including a significant portion of newly generated multi-turn distilled data.
Key Capabilities
- Advanced SQL Generation: Fine-tuned on datasets like Spider/BIRD and DBBench v4, it demonstrates strong capabilities in generating accurate SQL queries from natural language.
- Multi-turn Database Conversations: A key differentiator is the inclusion of 14% distilled data, comprising multi-turn database conversations generated by the powerful Qwen2.5-72B-Instruct-AWQ model. This enhances its ability to handle complex, sequential interactions with databases.
- Agent Task Proficiency: Optimized for various agentic workflows, including those requiring reasoning over structured data and environmental interaction (e.g., ALFWorld).
Good for
- Database Agents: Ideal for building agents that can understand and execute complex queries, or engage in multi-turn dialogues to retrieve and manipulate database information.
- SQL Generation: Applications requiring highly accurate and context-aware SQL generation from natural language prompts.
- Complex Reasoning Tasks: Suitable for scenarios where an agent needs to reason over structured data and interact with environments based on textual instructions.