Model Overview
Sakai0920/LLM-Advanced-Competition-2025 is a 7.6-billion-parameter language model fine-tuned from the Qwen/Qwen2.5-7B-Instruct base model in 16-bit (BF16) precision. Its primary objective is to improve ReAct-style agent performance in two target domains: household task environments (ALFWorld) and database operations (DBBench).
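As a BF16 checkpoint on the Hugging Face Hub, the model should load with the standard Transformers pattern. This is a hedged sketch, not an official snippet from the model authors; it assumes `transformers` and `torch` are installed and that the repo follows the usual causal-LM layout:

```python
# Standard Hugging Face loading sketch (assumption: the repo is a
# conventional causal-LM checkpoint; the authors may recommend a
# different setup).
REPO_ID = "Sakai0920/LLM-Advanced-Competition-2025"


def load(repo_id: str = REPO_ID):
    # Heavy imports kept local so the module can be inspected without torch.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype=torch.bfloat16,  # model card states BF16 weights
        device_map="auto",
    )
    return tokenizer, model
```

Loading in `torch.bfloat16` keeps memory roughly at the checkpoint's native footprint (about 15 GB for 7.6B parameters), which fits a single A100 80GB with room for activations.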
Key Capabilities
- Optimized for Agentic Workflows: Specifically trained to improve the effectiveness of ReAct-style agents.
- Domain Specialization: Excels in tasks related to household environments (ALFWorld) and database operations (DBBench).
- Diverse Training Data: Trained on curated trajectories, data distilled from the larger Qwen/Qwen3-32B, and augmented examples designed to address common failure patterns.
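The ReAct pattern the capabilities above refer to alternates model-generated Thought/Action steps with environment Observations. A minimal control loop can be sketched in plain Python; this is illustrative only and not the model's actual agent harness (the `policy` and `env_step` callables are hypothetical stand-ins for the model call and the ALFWorld/DBBench environment):

```python
import re


def react_loop(policy, env_step, task, max_steps=5):
    """Minimal ReAct-style loop (illustrative sketch).

    policy:   maps the transcript so far to the next
              "Thought: ... Action: ..." block (here, the LLM call).
    env_step: executes an action string and returns an observation.
    """
    transcript = f"Task: {task}\n"
    for _ in range(max_steps):
        step = policy(transcript)
        transcript += step + "\n"
        match = re.search(r"Action:\s*(.+)", step)
        if not match:
            break  # model produced no parseable action
        action = match.group(1).strip()
        if action == "finish":
            break  # agent declares the task complete
        observation = env_step(action)
        transcript += f"Observation: {observation}\n"
    return transcript
```

The augmented training data for "recovery loop avoidance" targets exactly the failure mode this loop exposes: an agent repeating the same failing action instead of revising its Thought.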
Training Details
The model was trained for 2 epochs on a single A100 80GB GPU. The training data comprised over 5,300 entries, including ALFWorld trajectories, DBBench ReAct data, and augmented sets targeting recovery-loop avoidance and no-examine scenarios.
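Trajectory fine-tuning data of this kind is typically stored as one chat-formatted JSON object per line (JSONL). The record below is a hypothetical illustration of that shape; the field names and content are assumptions, not the dataset's actual schema:

```python
import json

# Hypothetical training record for ReAct trajectory fine-tuning.
# All field names ("source", "messages", ...) are illustrative.
record = {
    "source": "alfworld_trajectory",  # or e.g. "dbbench_react", "augmented_recovery"
    "messages": [
        {"role": "system",
         "content": "You are a ReAct agent in a household environment."},
        {"role": "user",
         "content": "Task: put a clean mug on the shelf.\n"
                    "Observation: You are in the kitchen."},
        {"role": "assistant",
         "content": "Thought: I need to find a mug first.\n"
                    "Action: go to countertop 1"},
    ],
}

line = json.dumps(record)          # one JSON object per line in a .jsonl file
assert json.loads(line) == record  # round-trips cleanly
```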
Good For
- Developing and deploying agents that require robust ReAct-style reasoning.
- Applications involving automated task execution in simulated household environments.
- Systems needing efficient and accurate database interaction through natural language commands.
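For the database use case, the surrounding system still has to pull an executable SQL statement out of the agent's free-form response. A small helper like the following can do that, assuming the agent wraps SQL in a fenced block; the exact DBBench action format may differ, so treat this as a sketch:

```python
import re


def extract_sql(response: str):
    """Return the first fenced SQL statement in a ReAct-style response,
    or None if there is none. Assumes ```sql ... ``` fencing, which is
    an assumption about the output format, not a documented contract."""
    match = re.search(r"```sql\s*(.*?)```", response, re.DOTALL)
    return match.group(1).strip() if match else None
```

A response such as `"Thought: count the rows.\nAction: Operation\n```sql\nSELECT COUNT(*) FROM users;\n```"` would yield `SELECT COUNT(*) FROM users;` for execution against the database.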