aasim-m/daft-qwen2.5-coder-3b-instruct-full-loss-0.02

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 3.1B · Quant: BF16 · Ctx Length: 32k · Published: Apr 14, 2026 · License: other · Architecture: Transformer · Cold

The aasim-m/daft-qwen2.5-coder-3b-instruct-full-loss-0.02 model is a 3.1 billion parameter instruction-tuned causal language model, fine-tuned from Qwen/Qwen2.5-Coder-3B-Instruct. It was specifically trained on the daft_functions_dedup_sharegpt dataset, indicating an optimization for code-related tasks, particularly function generation or understanding. With a context length of 32768 tokens, this model is designed for applications requiring robust code instruction following and generation.


Model Overview

This model, aasim-m/daft-qwen2.5-coder-3b-instruct-full-loss-0.02, is a specialized instruction-tuned fine-tune of Qwen/Qwen2.5-Coder-3B-Instruct. It has 3.1 billion parameters and supports a 32768-token context window, making it suitable for processing longer code files or complex, multi-step programming instructions.
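The 32768-token limit covers the prompt and the generated output together, so long inputs eat into the generation budget. A back-of-the-envelope helper (function and parameter names are illustrative, not part of any published API) for checking that a prompt plus its generation budget fits:

```python
CTX_LEN = 32768  # model's maximum context, per the model card

def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    ctx_len: int = CTX_LEN) -> bool:
    """Check that the prompt plus the generation budget fits the context window."""
    return prompt_tokens + max_new_tokens <= ctx_len

print(fits_in_context(30000, 2048))  # True: 32048 <= 32768
print(fits_in_context(31000, 2048))  # False: 33048 > 32768
```

In practice the prompt token count would come from the model's tokenizer rather than being passed in by hand.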

Key Capabilities

  • Code Instruction Following: Fine-tuned on the daft_functions_dedup_sharegpt dataset, suggesting strong instruction-driven code understanding and generation.
  • Code Generation: Optimized for tasks related to programming functions and code structures.
  • Extended Context: The 32K token context window allows for handling more extensive codebases or detailed problem descriptions.
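Qwen2.5-family instruct models consume prompts in a ChatML-style format. A minimal sketch of how an instruction might be wrapped (the system message here is a placeholder; in practice you would call `tokenizer.apply_chat_template` from transformers rather than build the string by hand):

```python
def build_chatml_prompt(instruction: str,
                        system: str = "You are a helpful coding assistant.") -> str:
    """Wrap an instruction in the ChatML-style format used by Qwen2.5 instruct models."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{instruction}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt("Write a Python function that reverses a string.")
print(prompt)
```

The trailing `<|im_start|>assistant\n` leaves the turn open so the model's completion is the assistant's reply.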

Training Details

The model was trained with a learning rate of 0.0001, using an AdamW optimizer and a cosine learning rate scheduler. Training involved a total batch size of 64 over 3 epochs, leveraging multi-GPU distribution. This specific fine-tuning process aims to enhance its performance on code-centric tasks, differentiating it from general-purpose instruction models.
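The cosine schedule mentioned above starts at the base learning rate and decays smoothly toward zero. A minimal sketch of that decay (warmup omitted, and the total step count is hypothetical since the dataset size and step count are not published):

```python
import math

def cosine_lr(step: int, total_steps: int,
              base_lr: float = 1e-4, min_lr: float = 0.0) -> float:
    """Cosine decay: base_lr at step 0, min_lr at the final step."""
    progress = step / max(total_steps - 1, 1)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))

total_steps = 1000  # hypothetical; actual step count depends on dataset size
print(cosine_lr(0, total_steps))                # starts at the full 1e-4
print(cosine_lr(total_steps - 1, total_steps))  # decays to min_lr (0.0)
```

With a total batch size of 64 over 3 epochs, `total_steps` would be `3 * len(dataset) // 64` in an actual run.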

Intended Use Cases

This model is particularly well-suited for developers and researchers focused on:

  • Automated Code Generation: Creating functions or code blocks from natural language prompts.
  • Code Completion and Refactoring: Assisting with programming tasks within an IDE or development environment.
  • Educational Tools: Generating examples or explanations for programming concepts.

No benchmark results are published for this fine-tune; its specialized training on a code-focused dataset nonetheless suggests a strong fit for programming-related applications.