March07/Qwen2-5-Coder-32B-sft-kimi-800

Text Generation · Concurrency Cost: 2 · Model Size: 32.8B · Quant: FP8 · Ctx Length: 32k · Published: Mar 27, 2026 · License: other · Architecture: Transformer

March07/Qwen2-5-Coder-32B-sft-kimi-800 is a 32.8 billion parameter language model, fine-tuned from Qwen/Qwen2.5-Coder-32B-Instruct. This model specializes in code-related tasks, having undergone supervised fine-tuning on the kimi_800 dataset. With a context length of 32768 tokens, it is optimized for processing and generating extensive code sequences.


Overview

March07/Qwen2-5-Coder-32B-sft-kimi-800 is a 32.8 billion parameter language model fine-tuned from Qwen/Qwen2.5-Coder-32B-Instruct. It was trained on the kimi_800 dataset for 10 epochs with a learning rate of 1e-05; this supervised fine-tuning is intended to strengthen its performance on code-related tasks.

Key Training Details

  • Base Model: Qwen/Qwen2.5-Coder-32B-Instruct
  • Fine-tuning Dataset: kimi_800
  • Parameters: 32.8 billion
  • Context Length: 32768 tokens
  • Learning Rate: 1e-05
  • Optimizer: Fused AdamW (betas=(0.9, 0.999), epsilon=1e-08)
  • Epochs: 10
  • Batch Size: 1 per device (train), 8 per device (eval), with 4 gradient accumulation steps; the reported effective train batch size of 32 implies 8 devices (1 × 4 × 8 = 32). See the configuration sketch below.
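
The details above map directly onto standard Hugging Face TrainingArguments. The following is a minimal sketch, assuming the run used the transformers Trainer; the output path and the dataset/trainer wiring around it are illustrative, not taken from the card:

```python
# Hypothetical reconstruction of the reported hyperparameters as
# Hugging Face TrainingArguments. Only the values listed on this
# card are grounded; everything else is an assumption.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="Qwen2-5-Coder-32B-sft-kimi-800",  # assumed output path
    num_train_epochs=10,
    learning_rate=1e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,  # 1 x 4 x 8 devices = 32 effective
    optim="adamw_torch_fused",      # fused AdamW
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```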

Intended Use

This model is primarily intended for applications requiring advanced code understanding and generation, leveraging its specialized fine-tuning on a code-centric dataset. Its 32,768-token context window makes it suitable for handling complex and lengthy codebases.
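
For reference, a minimal inference sketch follows. It assumes the checkpoint is published under the repo id above and retains the Qwen2.5 chat template of its base model; the prompt and generation settings are illustrative only:

```python
# Minimal inference sketch (assumptions: the repo id resolves on the
# Hub and the tokenizer carries the base model's chat template).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "March07/Qwen2-5-Coder-32B-sft-kimi-800"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # simplification; the card lists an FP8 quant,
                         # whose loading path may differ
    device_map="auto",
)

messages = [
    {"role": "user",
     "content": "Write a Python function that merges two sorted lists."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Strip the prompt tokens and print only the completion.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:],
                       skip_special_tokens=True))
```

Because the model is chat-tuned, routing prompts through the chat template (rather than raw text completion) is the safer default for code-generation requests.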