Overview
This model, ahmedheakl/cass-sm4090-3b, is a specialized fine-tuned version of the Qwen/Qwen2.5-Coder-3B-Instruct base model. It features 3.1 billion parameters and a context length of 32768 tokens, making it suitable for handling moderately long code sequences and instructions.
Key Capabilities
- Code-focused Instruction Following: Fine-tuned from a Coder model, it is optimized for understanding and generating code based on instructions.
- Specialized Training Data: The model was trained on the cuda_amd_61k_4090_p1 and cuda_amd_61k_4090_p2 datasets, indicating a focus on tasks related to CUDA and AMD GPU environments.
- Efficient Performance: As a 3.1B-parameter model, it offers a balance between capability and computational efficiency, making it practical for deployment in resource-constrained environments.
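The model can be loaded with the standard Transformers chat API. Below is a minimal inference sketch; the generation settings and the example prompt are illustrative and not prescribed by this model card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ahmedheakl/cass-sm4090-3b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # place weights on available GPU(s)
)

# Illustrative prompt; any code-focused instruction works.
messages = [{"role": "user", "content": "Write a CUDA kernel that adds two float vectors."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```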
Training Details
The fine-tuning process involved a learning rate of 2e-05, a total batch size of 128, and 3 epochs. It utilized a cosine learning rate scheduler with a 0.1 warmup ratio. The training was distributed across 4 GPUs using PyTorch 2.6.0+cu124 and Transformers 4.51.3.
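For reference, the reported hyperparameters map onto a Transformers TrainingArguments configuration roughly as follows. The per-device batch size and gradient-accumulation split are assumptions; only the effective batch size of 128 across 4 GPUs is stated above.

```python
from transformers import TrainingArguments

# Sketch of the reported setup; only the totals noted below are from the model card.
args = TrainingArguments(
    output_dir="cass-sm4090-3b",     # hypothetical output path
    learning_rate=2e-5,
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    per_device_train_batch_size=8,   # assumption: 8 per GPU x 4 GPUs
    gradient_accumulation_steps=4,   # assumption: x4 accumulation -> 128 effective
)
```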