Overview
This model, kamaboko2007/llm_advance_024_enhanced_rules, is a 4-billion-parameter model based on Qwen3-4B-Instruct-2507, fine-tuned specifically for high performance on AgentBench tasks, particularly ALFWorld and DBBench. It addresses common problems in multi-task agent fine-tuning, such as catastrophic forgetting and output-format collisions between tasks, through an unconventional routing approach.
Key Innovation: Jinja2 Contextual Routing & Heuristics Injection
The core differentiator of this model is its custom tokenizer_config.json, whose chat_template contains Jinja2 logic that inspects the conversation at render time. This mechanism acts as an "Absolute Defense Shield" and "Dynamic Heuristics Injector": it intercepts user prompts and injects task-specific system prompts ("Cheat Sheets") immediately before inference, letting the model adapt its behavior to the detected task.
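As a rough, hypothetical sketch of what such a routing chat_template might look like (the variable names, markers, and cheat-sheet contents below are illustrative placeholders, not the model's actual template):

```jinja
{#- Hypothetical sketch: inspect the latest user turn and prepend a task-specific system prompt -#}
{%- set last_user = (messages | selectattr('role', 'equalto', 'user') | list | last).content -%}
{%- if 'MySQL' in last_user or 'SQL' in last_user -%}
    {{- '<|im_start|>system\n' ~ db_cheat_sheet ~ '<|im_end|>\n' -}}
{%- elif 'household' in last_user or 'Interact with a' in last_user -%}
    {{- '<|im_start|>system\n' ~ alfworld_cheat_sheet ~ '<|im_end|>\n' -}}
{%- endif -%}
{#- ...followed by the normal ChatML rendering of all messages... -#}
```

Because the logic lives entirely in the template, the injection happens transparently whenever the tokenizer renders a conversation; no inference-side code changes are required.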
Task-Specific Enhancements:
- DB Bench (MySQL) Mode: Automatically detects "MySQL" or "SQL" in prompts and injects rules for error recovery (e.g., running DESCRIBE table_name; after a SQL error) and loop prevention.
- ALFWorld (Household) Mode: Detects "household" or "Interact with a" and enforces a stable Think:/Act: format, overriding evaluation-system traps. It also injects exploration logic, such as analyzing failed actions and avoiding re-searching receptacles already found empty.
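The detection logic described above can be sketched in plain Python for clarity. This is a simulation of the keyword routing the Jinja2 template performs, not the actual template code, and the cheat-sheet texts are illustrative placeholders:

```python
def route_system_prompt(user_message: str) -> str:
    """Sketch of the keyword routing performed by the custom chat_template.

    The returned cheat-sheet strings are illustrative placeholders,
    not the model's actual injected prompts.
    """
    # DB Bench mode: triggered by SQL-related keywords in the prompt.
    if "MySQL" in user_message or "SQL" in user_message:
        return ("DBBench rules: on a SQL error, run DESCRIBE table_name; "
                "to inspect the schema; never repeat a failing query verbatim.")
    # ALFWorld mode: triggered by household-task phrasing.
    if "household" in user_message or "Interact with a" in user_message:
        return ("ALFWorld rules: always reply in Think:/Act: format; "
                "track failed actions and skip receptacles already found empty.")
    # No task detected: fall back to the default template behavior.
    return ""

# Example routing decisions:
assert "DESCRIBE" in route_system_prompt("You are given a MySQL database ...")
assert "Think:" in route_system_prompt("Interact with a household to solve a task.")
assert route_system_prompt("Hello!") == ""
```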
Training Configuration
The model was trained with LoRA on a highly curated "Golden Ratio" dataset of 494 high-quality trajectories (ALFWorld v5 and distilled DBBench). Loss was computed only on the assistant turns of each multi-turn trajectory. Key hyperparameters: max sequence length 8192, 2 epochs, learning rate 1e-6.
Usage
To leverage the model's unique capabilities, it is critical to load the tokenizer bundled with this repository: the Jinja2 chat_template is integral to the model's dynamic behavior, and substituting a stock Qwen3 tokenizer would disable the task routing entirely.
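A typical loading pattern uses the standard Transformers API; since the routing lives in the chat_template, apply_chat_template handles the injection automatically. The snippet below is a sketch: the helper function is ours, and the commented usage requires network access to the Hub:

```python
from typing import Dict, List


def build_chat(user_prompt: str) -> List[Dict[str, str]]:
    """Assemble the messages list; the custom chat_template does the routing."""
    return [{"role": "user", "content": user_prompt}]


# Typical usage with Transformers (requires downloading the model):
#
#   from transformers import AutoTokenizer, AutoModelForCausalLM
#   repo = "kamaboko2007/llm_advance_024_enhanced_rules"
#   tok = AutoTokenizer.from_pretrained(repo)   # loads the custom chat_template
#   model = AutoModelForCausalLM.from_pretrained(repo)
#   prompt = tok.apply_chat_template(
#       build_chat("You are given a MySQL database ..."),
#       tokenize=False,
#       add_generation_prompt=True,
#   )
```

Note that only the tokenizer call changes the prompt; generation itself proceeds as with any other Qwen3-based model.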