aasim-m/daft-qwen2.5-coder-3b-instruct-full
aasim-m/daft-qwen2.5-coder-3b-instruct-full is a 3.1-billion-parameter instruction-tuned language model, fine-tuned from Qwen/Qwen2.5-Coder-3B-Instruct. It specializes in code generation and understanding, having been further trained on the daft_functions_dedup_sharegpt dataset, and its 32,768-token context length makes it well suited to processing and generating larger code inputs.
Model Overview
The aasim-m/daft-qwen2.5-coder-3b-instruct-full is a 3.1 billion parameter instruction-tuned model, building upon the base architecture of Qwen/Qwen2.5-Coder-3B-Instruct. It has been specifically fine-tuned on the daft_functions_dedup_sharegpt dataset, indicating a strong specialization in code-related tasks, particularly function generation and understanding.
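Since the fine-tuning data is in ShareGPT-style conversational form, prompting the model means serializing chat turns into its chat template. Qwen-family models use the ChatML format, and in practice the tokenizer's apply_chat_template method handles this automatically; the manual sketch below is only meant to illustrate the expected structure, not to replace that method.

```python
def build_chatml_prompt(messages):
    """Serialize chat messages into ChatML, the chat format used by
    Qwen-family models. Shown for illustration only; prefer
    tokenizer.apply_chat_template in real code."""
    parts = []
    for msg in messages:
        # Each turn is wrapped in <|im_start|>{role} ... <|im_end|> markers.
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>\n")
    # The assistant header is left open so generation continues from here.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a function that reverses a string."},
])
print(prompt)
```

The trailing open assistant header is what signals the model to produce its reply rather than another user turn.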
Key Capabilities
- Code Generation: Optimized for generating code, likely focusing on Python functions given its training data.
- Instruction Following: Designed to follow instructions for coding tasks.
- Context Handling: Features a substantial context length of 32768 tokens, allowing it to process larger code snippets or conversational turns related to programming.
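The 32,768-token window is shared between the prompt and the generated output, so long code-assistance prompts need a token budget. A minimal sketch of that check (the token counts here are illustrative; in practice you would measure the prompt length with the model's tokenizer):

```python
MAX_CONTEXT = 32_768  # the model's context window, in tokens

def fits_in_context(prompt_tokens: int, max_new_tokens: int) -> bool:
    """Return True if the prompt plus its generation budget fits the window."""
    return prompt_tokens + max_new_tokens <= MAX_CONTEXT

# A 30,000-token prompt leaves at most 2,768 tokens for generation.
print(fits_in_context(30_000, 2_768))  # → True
print(fits_in_context(30_000, 3_000))  # → False
```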
Training Details
The model was trained with a learning rate of 1e-05, a total batch size of 512 (accumulated across 4 GPUs via gradient accumulation), and the fused AdamW optimizer (adamw_torch_fused). Training ran for 3 epochs with a cosine learning rate scheduler.
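The reported hyperparameters can be tied together with a short sketch: the per-device batch size and accumulation steps are not reported, so the values below are illustrative and only need to multiply out to the stated total of 512, while the cosine schedule follows the standard decay-to-zero formula (warmup, if any, is not shown).

```python
import math

# Hyperparameters reported for this fine-tune
peak_lr = 1e-5
num_gpus = 4
total_batch_size = 512

# Illustrative split (not reported); must satisfy:
#   num_gpus * per_device_batch * grad_accum_steps == total_batch_size
per_device_batch = 8
grad_accum_steps = total_batch_size // (num_gpus * per_device_batch)  # 16

def cosine_lr(step: int, total_steps: int, peak: float = peak_lr) -> float:
    """Standard cosine decay from `peak` to 0 over `total_steps`."""
    return peak * 0.5 * (1.0 + math.cos(math.pi * step / total_steps))

print(cosine_lr(0, 1000))    # → 1e-05 (peak at the start)
print(cosine_lr(500, 1000))  # ≈ 5e-06 (half the peak at the midpoint)
```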
Intended Use Cases
This model is particularly well-suited for:
- Code completion and generation: Assisting developers in writing code, especially functions.
- Code explanation: Understanding and explaining existing code segments.
- Educational tools: Providing code examples or solutions based on prompts.
Due to its specialized fine-tuning, it is expected to perform best on tasks directly related to code generation and comprehension, building on its Qwen2.5-Coder base and dataset-specific training.