aasim-m/daft-qwen2.5-coder-3b-instruct-full-loss-0.02
The aasim-m/daft-qwen2.5-coder-3b-instruct-full-loss-0.02 model is a 3.1-billion-parameter instruction-tuned causal language model, fine-tuned from Qwen/Qwen2.5-Coder-3B-Instruct. It was trained on the daft_functions_dedup_sharegpt dataset, indicating an optimization for code-related tasks, particularly function generation and comprehension. With a context length of 32768 tokens, the model is designed for applications requiring robust code instruction following and generation.
Model Overview
This model, aasim-m/daft-qwen2.5-coder-3b-instruct-full-loss-0.02, is a specialized instruction-tuned variant of the Qwen2.5-Coder-3B-Instruct architecture. It features 3.1 billion parameters and supports a substantial context length of 32768 tokens, making it suitable for processing longer code snippets or complex programming instructions.
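The model can be loaded with the standard transformers causal-LM classes. The snippet below is a minimal sketch: the repository ID comes from this card, while the dtype and device settings are illustrative assumptions rather than documented requirements.

```python
# Minimal loading sketch using the standard transformers API.
# The model ID is from this card; dtype/device choices are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aasim-m/daft-qwen2.5-coder-3b-instruct-full-loss-0.02"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: halves memory vs. fp32; use "auto" if unsure
    device_map="auto",           # requires accelerate; places weights on available devices
)
```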
Key Capabilities
- Code Instruction Following: Fine-tuned on the daft_functions_dedup_sharegpt dataset, suggesting a strong capability in understanding and generating code from instructions (see the usage sketch after this list).
- Code Generation: Optimized for tasks related to programming functions and code structures.
- Extended Context: The 32K token context window allows for handling more extensive codebases or detailed problem descriptions.
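As a usage sketch for the instruction-following capability above, the snippet applies the tokenizer's chat template and generates a function from a natural-language prompt. It assumes the model and tokenizer from the loading sketch; the prompt and generation parameters are illustrative, not taken from the card.

```python
# Hypothetical prompt; generation settings are illustrative defaults.
messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function that returns the n-th Fibonacci number."},
]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant turn marker
    return_tensors="pt",
).to(model.device)

output_ids = model.generate(inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][inputs.shape[-1]:], skip_special_tokens=True))
```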
Training Details
The model was trained with a learning rate of 0.0001, using an AdamW optimizer and a cosine learning rate scheduler. Training involved a total batch size of 64 over 3 epochs, leveraging multi-GPU distribution. This specific fine-tuning process aims to enhance its performance on code-centric tasks, differentiating it from general-purpose instruction models.
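For orientation, here is a hedged sketch of how the reported hyperparameters (learning rate 0.0001, AdamW, cosine schedule, total batch size 64, 3 epochs) might map onto transformers TrainingArguments. The per-device batch size, gradient-accumulation split, output path, and precision are assumptions; the card only reports the totals.

```python
# Hedged reconstruction of the reported hyperparameters. Only the effective
# batch size of 64 is reported; the per-device/accumulation split below is an
# assumption, and with multiple GPUs the product also includes the GPU count.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./daft-qwen2.5-coder-3b",  # hypothetical path
    learning_rate=1e-4,                    # reported
    optim="adamw_torch",                   # reported optimizer family
    lr_scheduler_type="cosine",            # reported schedule
    num_train_epochs=3,                    # reported
    per_device_train_batch_size=8,         # assumption
    gradient_accumulation_steps=8,         # assumption: 8 x 8 = 64 effective
    bf16=True,                             # assumption: common at this model size
)
```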
Intended Use Cases
This model is particularly well-suited for developers and researchers focused on:
- Automated Code Generation: Creating functions or code blocks from natural language prompts.
- Code Completion and Refactoring: Assisting with programming tasks within an IDE or development environment (a refactoring sketch follows this list).
- Educational Tools: Generating examples or explanations for programming concepts.
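The refactoring use case can reuse the same chat interface: pass existing code in the user turn and ask for a revision. The helper below is hypothetical and simply wraps the generation pattern shown earlier, assuming the same model and tokenizer objects.

```python
# Hypothetical helper wrapping the chat-template generation pattern above.
def refactor(code: str, instruction: str) -> str:
    messages = [
        {"role": "user", "content": f"{instruction}\n\n{code}"},
    ]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output_ids = model.generate(inputs, max_new_tokens=512, do_sample=False)
    return tokenizer.decode(output_ids[0][inputs.shape[-1]:], skip_special_tokens=True)

# Illustrative call with a toy snippet.
print(refactor("def f(x):\n    return x*x", "Rename this function to `square` and add a docstring."))
```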
While specific performance metrics are not detailed, its specialized training on a code-focused dataset indicates a strong aptitude for programming-related applications.