Pinkstack/DistilGPT-OSS-qwen3-4B

License: apache-2.0

Overview

Pinkstack/DistilGPT-OSS-qwen3-4B is a 4-billion-parameter model built on the Qwen3 architecture and fine-tuned on GPT-OSS reasoning outputs. This distinguishes it from the original Qwen3, which likely drew on DeepSeek R1 outputs for advanced reasoning. The model supports a total context of up to 262K tokens and, at the high effort setting, can reason for up to 65,536 tokens. A minimal loading sketch follows.
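The fine-tune loads with the standard Hugging Face transformers workflow. A minimal sketch, assuming the transformers and torch packages are installed; the prompt text is a placeholder:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Pinkstack/DistilGPT-OSS-qwen3-4B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the published BF16 weights
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Summarize why hash maps give O(1) average lookups."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=1024)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```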

Key Capabilities & Differentiators

  • Adjustable Reasoning Effort: Users can set the reasoning effort (low, medium, high) in the system prompt, tailoring computational intensity to task complexity (see the sketch after this list).
  • Distinct Answer Style: Responses read closer to ChatGPT's style and contain notably fewer emojis than those of the original Qwen3 4B.
  • Performance-Focused Fine-tuning: The fine-tuning prioritized performance over strict censorship, so the model tolerates more "creative" prompts while remaining safety-trained.
  • Efficient Local Operation: Designed for efficient on-device assistance, making it well suited to local deployments.
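Reasoning effort is set in the system prompt. A short sketch, reusing the tokenizer and model from the loading example above; the exact wording "Reasoning: high" is an assumption borrowed from the GPT-OSS convention, so check the model's chat template for the phrase it actually expects:

```python
messages = [
    # Assumed effort phrasing; swap "high" for "medium" or "low" on lighter tasks.
    {"role": "system", "content": "Reasoning: high"},
    {"role": "user", "content": "How many three-digit numbers are divisible by 7?"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# High effort can reason for tens of thousands of tokens, so leave generous room.
output = model.generate(inputs, max_new_tokens=4096)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```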

Recommended Use Cases

  • Local on-device efficient assistance
  • Code generation
  • Math problem solving
  • Summarization
  • General day-to-day use

Limitations

Due to its size and potential for hallucinations, the model is explicitly not recommended for legal, medical, or other high-risk applications that demand exact accuracy. Its general knowledge is limited by its parameter count.