charlie-li/Qwen3-4B-Instruct-2507-ScaleSWE-Distilled-Epoch2

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:May 7, 2026Architecture:Transformer Warm

The charlie-li/Qwen3-4B-Instruct-2507-ScaleSWE-Distilled-Epoch2 is a 4 billion parameter instruction-tuned language model, derived from Qwen/Qwen3-4B-Instruct-2507. This model has been further fine-tuned using the ScaleSWE-Distilled trajectories for two epochs, indicating a specialization in tasks related to software engineering or similar domains. With a 32768 token context length, it is designed for applications requiring robust instruction following and extended conversational or code-related interactions.

Loading preview...

Model Overview

This model, charlie-li/Qwen3-4B-Instruct-2507-ScaleSWE-Distilled-Epoch2, is a 4 billion parameter instruction-tuned language model. It is built upon the base of Qwen/Qwen3-4B-Instruct-2507, a model from the Qwen family known for its strong general-purpose capabilities.

Key Characteristics

  • Base Model: Derived from Qwen/Qwen3-4B-Instruct-2507.
  • Fine-tuning: Underwent an additional two epochs of supervised fine-tuning (SFT) using the ScaleSWE-Distilled trajectories. This specific fine-tuning dataset suggests an optimization for tasks related to software engineering (SWE), potentially including code generation, debugging, or technical problem-solving.
  • Parameter Count: Features 4 billion parameters, offering a balance between performance and computational efficiency.
  • Context Length: Supports a substantial context window of 32768 tokens, enabling it to handle longer inputs and maintain coherence over extended interactions.

Potential Use Cases

Given its specialized fine-tuning on ScaleSWE-Distilled trajectories, this model is likely well-suited for:

  • Software Engineering Tasks: Assisting with code generation, understanding, and debugging.
  • Technical Instruction Following: Executing complex technical commands or multi-step instructions.
  • Extended Technical Conversations: Maintaining context and providing relevant responses in long technical discussions.