jiayicheng/mix760_3step_bc760

Text generation · Model size: 8B · Quantization: FP8 · Context length: 32k · Published: May 7, 2026 · License: other · Architecture: Transformer

The jiayicheng/mix760_3step_bc760 model is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It was trained for 7 epochs at a learning rate of 4e-05 with a cosine learning rate scheduler. It is a specialized fine-tuned variant, though its primary differentiator and intended use cases are not documented on this card.


Model Overview

This model, jiayicheng/mix760_3step_bc760, is an 8-billion-parameter language model fine-tuned from the base model Qwen/Qwen3-8B on the sft_mixed760_official760 dataset.
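Since the checkpoint derives from Qwen3-8B, it should load through the standard Hugging Face transformers API. The snippet below is a minimal sketch assuming the weights are published in a transformers-compatible format; the card does not state a serving method, and the dtype choice is illustrative.

```python
# Minimal loading sketch (assumes standard transformers-compatible weights;
# not confirmed by the model card).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jiayicheng/mix760_3step_bc760"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # FP8 quant is listed above; bf16 is a safe fallback
    device_map="auto",
)
```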

Training Details

The model was trained for 7 epochs with a learning rate of 4e-05. Key hyperparameters included a per-device train_batch_size of 1, gradient_accumulation_steps of 4, and the AdamW optimizer (the beta and epsilon values are not stated on this card). A cosine learning rate scheduler with a warmup ratio of 0.1 was employed, and training was distributed across 4 GPUs.
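For reference, the stated hyperparameters map onto a transformers TrainingArguments configuration roughly as follows. Values not given on the card (optimizer betas/epsilon, precision, output directory) are assumptions filled with common Trainer defaults, not confirmed settings.

```python
from transformers import TrainingArguments

# Sketch of the reported SFT configuration; unstated values fall back
# to Trainer defaults and are assumptions, not documented settings.
training_args = TrainingArguments(
    output_dir="mix760_3step_bc760",  # hypothetical path
    learning_rate=4e-5,
    num_train_epochs=7,
    per_device_train_batch_size=1,    # train_batch_size: 1
    gradient_accumulation_steps=4,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    optim="adamw_torch",              # AdamW; betas/epsilon not given on the card
    bf16=True,                        # assumption: typical for 8B fine-tunes
)
# Effective global batch size: 1 per device x 4 accumulation steps x 4 GPUs = 16
```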

Capabilities & Use Cases

As a fine-tuned version of Qwen3-8B, this model is expected to inherit the base model's general language understanding and generation capabilities. However, the specific enhancements or specialized applications resulting from fine-tuning on the sft_mixed760_official760 dataset are not documented. Users should consult further documentation for its intended uses and limitations. A basic chat-style call is sketched below.
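Continuing from the loading snippet above, the following generation example assumes the model retains the Qwen3 chat template from its base, a reasonable but unverified assumption for an SFT derivative. The prompt and generation settings are illustrative, not recommended values.

```python
# Hypothetical prompt; assumes the tokenizer ships the Qwen3 chat template.
messages = [
    {"role": "user", "content": "Summarize the transformer architecture in two sentences."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][inputs.shape[-1]:], skip_special_tokens=True))
```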