UltraThinker-Coder-3B: A Code-Optimized Language Model

UltraThinker-Coder-3B is a 3.1-billion-parameter language model developed by Malik Ayaan Ahmed and fine-tuned for coding applications. It is built on the unsloth/Qwen2.5-Coder-3B-bnb-4bit base model and was further trained with the TRL (Transformer Reinforcement Learning) library.

Key Capabilities

Code Generation: Built on a code-specialized base model (Qwen2.5-Coder-3B) and tuned for generating code.
Instruction Following: Fine-tuned with SFT (Supervised Fine-Tuning) to better understand and respond to user instructions.
Efficient Performance: At 3.1B parameters, it is small enough to trade some output quality for lower compute and memory cost.
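
To put the size claim in rough numbers, the sketch below estimates weight-only memory for 3.1 billion parameters at a few precisions. The function name `weight_memory_gb` and the choice of precisions are illustrative assumptions, not something taken from the model card. The figures ignore activations, the KV cache, and quantization metadata, so actual usage will be higher.

```python
PARAMS = 3.1e9  # parameter count stated in this model card


def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in gigabytes (10^9 bytes), weights only."""
    return num_params * bits_per_param / 8 / 1e9


# Precisions chosen for illustration; 4-bit matches the bnb-4bit base model.
for label, bits in [("float32", 32), ("float16/bfloat16", 16), ("4-bit", 4)]:
    print(f"{label:>17}: ~{weight_memory_gb(PARAMS, bits):.2f} GB")
```

Under these assumptions the weights alone come to roughly 12.4 GB in float32, 6.2 GB in 16-bit, and 1.55 GB in 4-bit form.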

Training Details

The model was trained using Supervised Fine-Tuning (SFT) with TRL 0.24.0, Transformers 5.5.0, PyTorch 2.10.0, Datasets 4.3.0, and Tokenizers 0.22.2.
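
When trying to reproduce or continue the fine-tuning, it can help to check a local environment against the versions listed above. The following is a minimal sketch using only the Python standard library. The helper `compare_versions` is hypothetical, and the PyPI distribution names (for example `torch` for PyTorch) are assumptions of this sketch rather than details from the model card.

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, version

# Versions listed in Training Details. The distribution names used as keys
# are assumed PyPI names, not taken from the model card.
TRAINING_VERSIONS = {
    "trl": "0.24.0",
    "transformers": "5.5.0",
    "torch": "2.10.0",
    "datasets": "4.3.0",
    "tokenizers": "0.22.2",
}


def compare_versions(expected: dict[str, str]) -> dict[str, tuple[str, str | None]]:
    """Map each package to (expected, installed); installed is None if missing."""
    report = {}
    for name, want in expected.items():
        try:
            # Drop any local build suffix such as "+cu128" before comparing.
            have = version(name).split("+")[0]
        except PackageNotFoundError:
            have = None
        report[name] = (want, have)
    return report


if __name__ == "__main__":
    for name, (want, have) in compare_versions(TRAINING_VERSIONS).items():
        status = "ok" if have == want else "differs"
        print(f"{name}: expected {want}, installed {have or 'not installed'} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; the check only flags where the local setup departs from the training environment.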

Good For

Developers looking for a compact yet capable model for code-related tasks.
Applications requiring instruction-tuned code generation.
Experimentation with fine-tuned Qwen2.5-based models.
