ichi234/exp002_stage2_s2_db_merged

Text Generation

  • Concurrency Cost: 1
  • Model Size: 7.6B
  • Quant: FP8
  • Ctx Length: 32k
  • Published: Mar 1, 2026
  • Architecture: Transformer

The ichi234/exp002_stage2_s2_db_merged model is a 7.6-billion-parameter language model based on Qwen2.5-7B-Instruct, fine-tuned for competitive agent benchmarks. It is optimized to stabilize output formats and maintain a high legal-action rate in ALFWorld environments, and to improve SQL and answer consistency on DBBench tasks. The model balances agentic reasoning across ALFWorld's two-line THOUGHT + ACTION format and DBBench's Action: Operation / Action: Answer format. It was built with a multi-stage LoRA fine-tuning process and offline distillation from openai/gpt-oss-120b.


Model Overview

ichi234/exp002_stage2_s2_db_merged is a 7.6-billion-parameter language model derived from Qwen/Qwen2.5-7B-Instruct. It has been fine-tuned to perform well in competitive agent environments, particularly ALFWorld and DBBench.
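
The merged model can be loaded like any other Qwen2.5-style causal LM. Below is a minimal sketch using Hugging Face transformers; the repo id comes from this card, while the example prompt and generation settings are illustrative assumptions rather than a prescribed interface.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ichi234/exp002_stage2_s2_db_merged"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Illustrative ALFWorld-style prompt; the real environment supplies
# its own observation and task text.
messages = [
    {"role": "user", "content": "You are in a kitchen. Put a clean mug on the coffee table."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```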

Key Capabilities

  • ALFWorld Optimization: Stabilizes the two-line THOUGHT + ACTION output format and maintains a high rate of legal actions within ALFWorld environments.
  • DBBench Enhancement: Improves the stability of the Action: Operation / Action: Answer formats and the consistency of generated SQL queries and answers for DBBench (a format-check sketch for both schemas follows this list).
  • Multi-stage Fine-tuning: Utilizes a multi-phase LoRA (bfloat16) training strategy, with varying learning rates and epochs, to progressively refine performance across different aspects of the target tasks.
  • Data Augmentation: Incorporates offline distillation using openai/gpt-oss-120b to expand and enhance the training dataset, contributing to improved model robustness and accuracy.
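
Both output schemas are simple enough to validate mechanically. The sketch below assumes the literal prefixes ("THOUGHT:", "ACTION:", "Action: Operation", "Action: Answer") implied by the format names above; the actual evaluation harness may differ.

```python
import re

def is_legal_alfworld(output: str) -> bool:
    """Accept only the two-line THOUGHT + ACTION shape described above."""
    lines = output.strip().splitlines()
    return (
        len(lines) == 2
        and lines[0].startswith("THOUGHT:")
        and lines[1].startswith("ACTION:")
    )

# The exact prefix strings are assumptions inferred from this card,
# not a confirmed harness specification.
DBBENCH_PREFIX = re.compile(r"^Action:\s*(Operation|Answer)\b", re.MULTILINE)

def is_legal_dbbench(output: str) -> bool:
    """Accept replies that declare either a SQL Operation or a final Answer."""
    return DBBENCH_PREFIX.search(output) is not None
```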

Training Methodology

The model underwent a two-stage training process. Stage 1 ran multiple LoRA phases with a maximum sequence length of 2048, first stabilizing ALFWorld output formats and then training on DBBench. Stage 2 trained on data augmented via offline distillation from openai/gpt-oss-120b, with a maximum sequence length of 4096, to further refine the model's specialized capabilities.
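
For concreteness, here is a hedged sketch of the kind of multi-phase LoRA setup this describes, using the peft library. The rank, target modules, learning rates, and epoch counts are illustrative placeholders; the card only states that these varied across phases and that training ran in bfloat16 with the sequence lengths noted above.

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Base model named on this card; trained in bfloat16 per the description.
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct", torch_dtype=torch.bfloat16
)

lora_cfg = LoraConfig(
    r=16,                      # illustrative rank, not the card's actual value
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)

# Phase schedule; the learning rates and epoch counts are placeholders.
phases = [
    {"name": "stage1_alfworld", "lr": 2e-4, "epochs": 2, "max_seq_len": 2048},
    {"name": "stage1_dbbench",  "lr": 1e-4, "epochs": 1, "max_seq_len": 2048},
    {"name": "stage2_distill",  "lr": 5e-5, "epochs": 1, "max_seq_len": 4096},
]

for phase in phases:
    # Each phase would run a supervised fine-tuning pass (e.g. with trl's
    # SFTTrainer) over its own dataset, truncated to phase["max_seq_len"].
    ...
```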

Good for

  • Developing agents for complex interactive environments like ALFWorld.
  • Applications requiring precise SQL generation and answer consistency in database interaction tasks (DBBench).
  • Research into multi-task learning and agentic behavior in structured environments.