aolans/Qwen2.5-7B-Instruct-SDFT-fp16

Text generation · Concurrency cost: 1 · Model size: 7.6B · Quant: FP8 · Context length: 32k · Published: Feb 27, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

aolans/Qwen2.5-7B-Instruct-SDFT-fp16 is a 7.6 billion parameter instruction-tuned model based on Qwen/Qwen2.5-7B-Instruct, fine-tuned to improve multi-turn agent task performance. It is optimized for complex tasks such as household automation (ALFWorld) and database operations (DBBench), learning to act on environment observations, select appropriate actions, and use tools. The model incorporates two experimental training techniques, SDFT and Epiplexity, aimed at improving reasoning, and is provided in fp16 format for direct loading.


Model Overview

aolans/Qwen2.5-7B-Instruct-SDFT-fp16 is a 7.6 billion parameter instruction-tuned model derived from Qwen/Qwen2.5-7B-Instruct. It has been fine-tuned using LoRA (merged into the base model) and is provided in fp16 precision, allowing for direct loading without separate adapter management.
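Because the adapter is already folded into the published weights, no PEFT loading step is needed at inference time. For reference, a merged fp16 checkpoint of this kind is typically produced roughly as in the sketch below; the adapter path and output directory are hypothetical, not the author's actual paths:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model in fp16, attach the trained LoRA adapter, then fold the
# adapter weights into the base weights so the result loads as a plain model.
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct", torch_dtype=torch.float16
)
merged = PeftModel.from_pretrained(base, "path/to/lora-adapter").merge_and_unload()
merged.save_pretrained("Qwen2.5-7B-Instruct-SDFT-fp16")
```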

Key Capabilities & Training Focus

This model is specifically trained to improve multi-turn agent task performance, with a strong emphasis on:

  • ALFWorld (household tasks): Enhancing the model's ability to navigate and complete tasks in simulated household environments.
  • DBBench (database operations): Improving proficiency in executing and managing database-related operations.

The training objective applies loss to all assistant turns in a multi-turn trajectory, enabling the model to learn from environment observations, select appropriate actions, utilize tools effectively, and recover from errors.
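A minimal sketch of that objective is shown below, assuming Qwen2.5's ChatML-style turn markers. The actual data pipeline is not published, so the example trajectory and masking code are illustrative only:

```python
from transformers import AutoTokenizer

IGNORE_INDEX = -100  # tokens with this label are excluded from the loss
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

# A toy ALFWorld-style trajectory: environment observations as user turns,
# agent actions as assistant turns.
trajectory = [
    ("user", "Observation: You are in the kitchen. Your task is to cool an apple."),
    ("assistant", "Action: open fridge 1"),
    ("user", "Observation: The fridge 1 is open. You see an apple 1."),
    ("assistant", "Action: take apple 1 from fridge 1"),
]

input_ids, labels = [], []
for role, content in trajectory:
    turn = f"<|im_start|>{role}\n{content}<|im_end|>\n"
    ids = tokenizer(turn, add_special_tokens=False)["input_ids"]
    input_ids += ids
    # Loss is applied to every assistant turn; observation turns are masked out.
    labels += ids if role == "assistant" else [IGNORE_INDEX] * len(ids)
```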

Experimental Features

This version incorporates two experimental training techniques: SDFT (Self-Distillation Enables Continual Learning) and Epiplexity (Rethinking Information for Computationally Bounded Intelligence). While these methods are still under evaluation and refinement, they aim to further enhance the model's reasoning capabilities. Training used a maximum sequence length of 4096 tokens over 2 epochs with a learning rate of 2e-6.
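Expressed as Hugging Face TrainingArguments, those hyperparameters would look roughly like the sketch below; the training framework actually used is not documented, and every value not stated on the card (output path, batch size, gradient accumulation) is an assumption:

```python
from transformers import TrainingArguments

# Hyperparameters from the model card: 2 epochs, learning rate 2e-6.
# The remaining values are assumptions for illustration only.
args = TrainingArguments(
    output_dir="qwen2.5-7b-instruct-sdft",  # hypothetical output path
    num_train_epochs=2,
    learning_rate=2e-6,
    per_device_train_batch_size=1,          # assumption
    gradient_accumulation_steps=8,          # assumption
)
# The 4096-token maximum sequence length would be enforced in the data
# pipeline (e.g. by truncating tokenized trajectories), not via TrainingArguments.
```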

Usage

The model can be loaded directly using AutoModelForCausalLM from the Hugging Face transformers library, leveraging torch.float16 for efficient inference.
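A minimal loading and generation sketch using that API is shown below; the prompt and generation settings are illustrative, not part of the model card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aolans/Qwen2.5-7B-Instruct-SDFT-fp16"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Build a chat-formatted prompt and generate a reply.
messages = [{"role": "user", "content": "You are in a kitchen. Find and cool an apple."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```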