chancharikm/all_sft_formats_balanced_human_only_20260222_1240_ep3_lr3e5_qwen3-vl-8b
chancharikm/all_sft_formats_balanced_human_only_20260222_1240_ep3_lr3e5_qwen3-vl-8b is an 8-billion-parameter vision-language model fine-tuned from Qwen/Qwen3-VL-8B-Instruct. It was trained with a learning rate of 3e-05 over 6 epochs on a balanced dataset of SFT formats, and it supports a 32,768-token context length, making it suitable for tasks that require extensive contextual understanding.
Model Overview
This model, chancharikm/all_sft_formats_balanced_human_only_20260222_1240_ep3_lr3e5_qwen3-vl-8b, is an 8-billion-parameter vision-language model. It is a fine-tuned variant of Qwen/Qwen3-VL-8B-Instruct, an instruction-tuned base model from the Qwen3-VL family.
Key Training Details
The model underwent a specific fine-tuning process with the following hyperparameters:
- Base Model: Qwen/Qwen3-VL-8B-Instruct
- Learning Rate: 3e-05
- Epochs: 6.0
- Batch Size: total training batch size of 128 (a `train_batch_size` of 8 with `gradient_accumulation_steps` of 2 across 8 devices)
- Optimizer: AdamW with the beta and epsilon values set in the training configuration
- Scheduler: Cosine learning rate scheduler with a 0.05 warmup ratio.
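As a rough illustration, the listed values map onto Hugging Face `TrainingArguments` as sketched below. This is a hypothetical reconstruction: the card does not include the training script, so the `output_dir`, precision settings, and optimizer name are assumptions, not reported values.

```python
from transformers import TrainingArguments

# Hypothetical mapping of the reported hyperparameters onto the Trainer API.
# Only the numeric values come from the card; everything else is assumed.
args = TrainingArguments(
    output_dir="qwen3-vl-8b-sft",       # hypothetical output path
    learning_rate=3e-5,
    num_train_epochs=6.0,
    per_device_train_batch_size=8,      # train_batch_size of 8 per device
    gradient_accumulation_steps=2,      # x2 gradient accumulation
    # effective batch size: 8 devices x 8 per device x 2 steps = 128
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    optim="adamw_torch",                # AdamW optimizer
)
```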
Intended Use
While specific intended uses and limitations are not detailed in the provided README, its origin as an instruction-tuned Qwen3-VL-8B model suggests applications in instruction-following tasks. Fine-tuning on a "balanced human-only" dataset implies an emphasis on human-like conversational or instructional responses. Developers should evaluate its performance for their specific use cases, particularly those that benefit from the 32,768-token context window.
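A minimal inference sketch follows, assuming the checkpoint loads through the `transformers` image-text-to-text auto classes (Qwen3-VL support may require a recent `transformers` release). The prompt text is illustrative and not taken from the card.

```python
import torch
from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "chancharikm/all_sft_formats_balanced_human_only_20260222_1240_ep3_lr3e5_qwen3-vl-8b"

# Load the fine-tuned checkpoint and its processor.
model = AutoModelForImageTextToText.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# A text-only instruction; the base model also accepts image entries in the
# message content, following the Qwen-VL chat format.
messages = [
    {
        "role": "user",
        "content": [{"type": "text", "text": "Explain supervised fine-tuning in two sentences."}],
    }
]
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[text], return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=256)

# Decode only the newly generated tokens, skipping the prompt.
response = processor.batch_decode(
    output_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]
print(response)
```

For image inputs, include image entries in the message content per the Qwen-VL chat format and pass the corresponding images to the processor alongside the text.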