Aznaur/tbench-qwen-sft-multitask-nat-v8 is an 8-billion-parameter Qwen3-based language model fine-tuned with an enhanced Negative-Aware Training method (NAT v8). The model specializes in reliable tool usage and in avoiding task-specific failure patterns across multiple terminal benchmark tasks. It is specifically trained to suppress common LLM failure modes such as hallucinated arguments, looping behavior, and incorrect command formats, making it suitable for agentic applications that require robust command execution.
Model Overview
Aznaur/tbench-qwen-sft-multitask-nat-v8 is an 8-billion-parameter model built on the Qwen3-8B architecture and fine-tuned with Enhanced Negative-Aware Training (NAT v8). This training methodology teaches the model to recognize and avoid common failure patterns, significantly improving its reliability on agentic tasks.
Key Capabilities & Training
The model was trained for 200 epochs on five terminal benchmark tasks: fix-git, log-summary-date-ranges, pypi-server, regex-log, and cancel-async-tasks. A crucial aspect of its training is the inclusion of negative examples (10 per epoch, 2 per task) alongside positive ones. These negative examples target specific anti-patterns:
- Hallucinated arguments: Prevents the model from generating non-existent or incorrect arguments.
- Looping behavior: Mitigates repetitive command execution after task completion.
- Wrong command format: Ensures correct syntax and usage of commands.
- Task-specific failures: Addresses customized negative patterns relevant to each of the 5 training tasks.
This approach, inspired by the paper "Learning From Failure: Integrating Negative Examples when Fine-tuning Large Language Models as Agents" (arXiv:2402.11651), aims to produce a more robust, error-resistant model for automated task execution.
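To make the training-mix numbers concrete, the following is a minimal, hypothetical sketch of how a per-epoch mix with 2 tagged negative examples per task (10 across the 5 named tasks) could be assembled. The task names come from this model card; the function, the `positives_per_task` count, and the anti-pattern tags are illustrative assumptions, not the actual training pipeline.

```python
import random

# Tasks named in the model card; counts and structure below are illustrative.
TASKS = [
    "fix-git",
    "log-summary-date-ranges",
    "pypi-server",
    "regex-log",
    "cancel-async-tasks",
]

# Anti-pattern categories described in the card (assignment here is random
# for illustration only).
ANTI_PATTERNS = [
    "hallucinated-arguments",
    "looping-behavior",
    "wrong-command-format",
    "task-specific-failure",
]

def build_epoch_mix(positives_per_task=4, negatives_per_task=2, seed=0):
    """Return a shuffled list of training examples for one epoch.

    Each example carries a `label` field so the training loss can treat
    negatives differently from positives (the core idea of negative-aware
    training in arXiv:2402.11651).
    """
    rng = random.Random(seed)
    examples = []
    for task in TASKS:
        for i in range(positives_per_task):
            examples.append({"task": task, "label": "positive", "idx": i})
        for i in range(negatives_per_task):
            examples.append({
                "task": task,
                "label": "negative",
                "anti_pattern": rng.choice(ANTI_PATTERNS),
                "idx": i,
            })
    rng.shuffle(examples)
    return examples

mix = build_epoch_mix()
negatives = [e for e in mix if e["label"] == "negative"]
print(len(mix), len(negatives))  # 30 10 — i.e. 10 negatives, 2 per task
```

The `label` field is the only essential part of the sketch: downstream, a NAT-style trainer can use it to weight or condition on negative examples instead of imitating them.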
Ideal Use Cases
This model is particularly well-suited for applications requiring:
- Automated terminal command execution.
- Agentic systems that need to interact with command-line interfaces.
- Reducing common LLM errors such as hallucination and repetitive actions in task-oriented scenarios.
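In such agentic setups, a harness often pairs the model with a lightweight guard for the anti-patterns listed above. Below is a minimal sketch of a detector for the looping anti-pattern (repeated identical commands); the helper is hypothetical and not part of this model or any library.

```python
def detect_looping(commands, window=3):
    """Return True when the same command appears `window` times in a row,
    i.e. the repetitive-execution anti-pattern NAT v8 trains against.

    Hypothetical illustration; real harnesses may use richer heuristics
    (e.g. near-duplicate matching or no-progress detection).
    """
    run = 1
    for prev, cur in zip(commands, commands[1:]):
        run = run + 1 if cur == prev else 1
        if run >= window:
            return True
    return False

# A healthy trajectory vs. a looping one:
ok = ["git status", "git log --oneline", "git checkout main"]
loop = ["ls", "cat log.txt", "cat log.txt", "cat log.txt"]
print(detect_looping(ok), detect_looping(loop))  # False True
```

A harness could call such a check after each model turn and interrupt or re-prompt when it fires, complementing the training-time mitigation described above.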