Model Overview
This model, aolans/Qwen2.5-7B-Instruct-SDFT-2ep-fp16, is a fine-tuned version of the Qwen/Qwen2.5-7B-Instruct base model, developed by aolans. It has been trained using LoRA and Unsloth, with the adapter weights merged into the base model, and is provided in fp16 precision for direct loading.
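Because the LoRA adapter is already merged, the checkpoint can be loaded like any standard `transformers` causal-LM model, with no PEFT step. A minimal loading sketch, assuming `transformers` and `torch` are installed:

```python
# Hedged sketch: load the merged fp16 checkpoint with plain transformers.
# No adapter/PEFT loading is needed because the LoRA weights are merged.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "aolans/Qwen2.5-7B-Instruct-SDFT-2ep-fp16"

def load_model():
    """Load tokenizer and model; torch_dtype=float16 matches the shipped weights."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,
        device_map="auto",  # place layers on available GPU(s)
    )
    return tokenizer, model

# Usage: tokenizer, model = load_model()
```

`device_map="auto"` is a convenience choice here, not a requirement of the model; any standard placement strategy works.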
Key Capabilities & Training Focus
The primary objective of this model's training was to improve performance on multi-turn agent tasks. In particular, it shows gains in environments such as ALFWorld (household tasks) and DBBench (database operations). The training methodology applied the loss to every assistant turn within each multi-turn trajectory, so the model learns environment observation, action selection, tool use, and error recovery.
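The "loss on all assistant turns" scheme can be illustrated with a toy label-masking function: assistant-turn tokens keep their ids as labels, while every other token gets the ignore index (-100), which cross-entropy skips. The token-level role tags below are hypothetical; real pipelines derive them from the chat template's special tokens.

```python
# Illustrative sketch of multi-turn loss masking, not the exact training code.
IGNORE_INDEX = -100  # PyTorch cross-entropy ignores positions with this label

def build_labels(input_ids, roles):
    """Keep labels only for assistant-turn tokens; mask everything else."""
    return [
        tok if role == "assistant" else IGNORE_INDEX
        for tok, role in zip(input_ids, roles)
    ]

# Toy trajectory: user turn -> assistant turn -> user turn -> assistant turn.
ids = [11, 12, 13, 21, 22, 31, 41, 42]
roles = ["user", "user", "user",
         "assistant", "assistant",
         "user",
         "assistant", "assistant"]
print(build_labels(ids, roles))  # [-100, -100, -100, 21, 22, -100, 41, 42]
```

Masking every non-assistant token this way means each assistant turn in the trajectory contributes to the gradient, not just the final response.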
Experimental Features
This iteration incorporates experimental training techniques: SDFT (Self-Distillation Enables Continual Learning) and Epiplexity (Rethinking Information for Computationally Bounded Intelligence). While these methods are still under evaluation and refinement, they aim to improve the model's reasoning capabilities. The model was trained for 2 epochs with a maximum sequence length of 4096 tokens.
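The stated hyperparameters can be summarized in a small config sketch. Only the epoch count, sequence length, precision, and base model come from this card; the LoRA rank, alpha, and learning rate are illustrative placeholders, not the values actually used.

```python
# Hedged configuration sketch; fields marked "assumption" are placeholders.
training_config = {
    "base_model": "Qwen/Qwen2.5-7B-Instruct",  # stated base model
    "num_train_epochs": 2,                     # stated: 2 epochs
    "max_seq_length": 4096,                    # stated: 4096-token max sequence length
    "precision": "fp16",                       # stated: merged weights ship in fp16
    "lora_r": 16,                              # assumption: typical LoRA rank
    "lora_alpha": 32,                          # assumption
    "learning_rate": 2e-4,                     # assumption
}
print(training_config["num_train_epochs"], training_config["max_seq_length"])
```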
Good For
- Agentic workflows: Particularly suited for tasks requiring sequential decision-making and interaction with environments.
- Multi-turn task execution: designed to track state and respond coherently across multiple conversational turns or steps.
- Research into experimental training methods: Provides a practical application of SDFT and Epiplexity for further study.