Aznaur/tbench-qwen-sft-multitask-nat-v8 is an 8-billion-parameter Qwen3-based language model fine-tuned with an enhanced Negative-Aware Training method (NAT v8). The model specializes in reliable tool usage and in avoiding task-specific failure patterns across multiple terminal benchmark tasks. It is specifically trained to suppress common LLM failure modes such as hallucinated arguments, looping behavior, and incorrect command formats, making it suitable for agentic applications that require robust command execution.
Model Overview
Aznaur/tbench-qwen-sft-multitask-nat-v8 is an 8-billion-parameter model built on the Qwen3-8B architecture and fine-tuned with Enhanced Negative-Aware Training (NAT v8). This training methodology teaches the model to recognize and avoid common failure patterns, significantly improving its reliability on agentic tasks.
Key Capabilities & Training
The model was trained for 200 epochs on five terminal benchmark tasks: fix-git, log-summary-date-ranges, pypi-server, regex-log, and cancel-async-tasks. A crucial aspect of its training is the inclusion of negative examples (10 per epoch, 2 per task) alongside positive ones. These negative examples target specific anti-patterns:
- Hallucinated arguments: Prevents the model from generating non-existent or incorrect arguments.
- Looping behavior: Mitigates repetitive command execution after task completion.
- Wrong command format: Ensures correct syntax and usage of commands.
- Task-specific failures: Addresses customized negative patterns relevant to each of the 5 training tasks.
This approach, inspired by the paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language Models as Agents" (arXiv:2402.11651), aims to produce a more robust, error-resistant model for automated task execution.
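To make the training-mix numbers concrete, the following is a minimal, hypothetical sketch of how a per-epoch mix with 2 tagged negative examples per task (10 across the 5 named tasks) could be assembled. The task names come from this model card; the function, the `positives_per_task` count, and the anti-pattern tags are illustrative assumptions, not the actual training pipeline.

```python
import random

# Tasks named in the model card; counts and structure below are illustrative.
TASKS = [
    "fix-git",
    "log-summary-date-ranges",
    "pypi-server",
    "regex-log",
    "cancel-async-tasks",
]

# Anti-pattern categories described in the card (assignment here is random
# for illustration only).
ANTI_PATTERNS = [
    "hallucinated-arguments",
    "looping-behavior",
    "wrong-command-format",
    "task-specific-failure",
]

def build_epoch_mix(positives_per_task=4, negatives_per_task=2, seed=0):
    """Return a shuffled list of training examples for one epoch.

    Each example carries a `label` field so the training loss can treat
    negatives differently from positives (the core idea of negative-aware
    training in arXiv:2402.11651).
    """
    rng = random.Random(seed)
    examples = []
    for task in TASKS:
        for i in range(positives_per_task):
            examples.append({"task": task, "label": "positive", "idx": i})
        for i in range(negatives_per_task):
            examples.append({
                "task": task,
                "label": "negative",
                "anti_pattern": rng.choice(ANTI_PATTERNS),
                "idx": i,
            })
    rng.shuffle(examples)
    return examples

mix = build_epoch_mix()
negatives = [e for e in mix if e["label"] == "negative"]
print(len(mix), len(negatives))  # 30 10 — i.e. 10 negatives, 2 per task
```

The `label` field is the only essential part of the sketch: downstream, a NAT-style trainer can use it to weight or condition on negative examples instead of imitating them.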
Ideal Use Cases
This model is particularly well-suited for applications requiring:
- Automated terminal command execution.
- Agentic systems that need to interact with command-line interfaces.
- Reducing common LLM errors such as hallucination and repetitive actions in task-oriented scenarios.
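In such agentic setups, a harness often pairs the model with a lightweight guard for the anti-patterns listed above. Below is a minimal sketch of a detector for the looping anti-pattern (repeated identical commands); the helper is hypothetical and not part of this model or any library.

```python
def detect_looping(commands, window=3):
    """Return True when the same command appears `window` times in a row,
    i.e. the repetitive-execution anti-pattern NAT v8 trains against.

    Hypothetical illustration; real harnesses may use richer heuristics
    (e.g. near-duplicate matching or no-progress detection).
    """
    run = 1
    for prev, cur in zip(commands, commands[1:]):
        run = run + 1 if cur == prev else 1
        if run >= window:
            return True
    return False

# A healthy trajectory vs. a looping one:
ok = ["git status", "git log --oneline", "git checkout main"]
loop = ["ls", "cat log.txt", "cat log.txt", "cat log.txt"]
print(detect_looping(ok), detect_looping(loop))  # False True
```

A harness could call such a check after each model turn and interrupt or re-prompt when it fires, complementing the training-time mitigation described above.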