amphora/Qwen3-4B-DASD-32K

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:May 14, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

The amphora/Qwen3-4B-DASD-32K is a 4 billion parameter Qwen3-based causal language model developed by amphora, fine-tuned from unsloth/Qwen3-4B-Instruct-2507. This model features a 32K context length and was trained with Unsloth, indicating an optimization for faster training. It is designed for general language tasks, leveraging its Qwen3 architecture and efficient training methodology.

Loading preview...

Model Overview

The amphora/Qwen3-4B-DASD-32K is a 4 billion parameter language model built upon the Qwen3 architecture, developed by amphora. It is a fine-tuned version of the unsloth/Qwen3-4B-Instruct-2507 model, indicating a focus on instruction-following capabilities. A notable aspect of this model is its training process, which utilized Unsloth to achieve a 2x speedup in training time.

Key Capabilities

  • Qwen3 Architecture: Leverages the robust and capable Qwen3 base model for strong general language understanding and generation.
  • Instruction-Tuned: Fine-tuned from an instruction-following model, suggesting proficiency in responding to diverse prompts and tasks.
  • Efficient Training: Benefits from Unsloth's optimization, which allows for faster and potentially more cost-effective model development.
  • 32K Context Length: Supports processing and generating text over a substantial context window, enabling more complex and longer interactions.

Good For

  • Applications requiring a capable 4 billion parameter model with a large context window.
  • Instruction-following tasks where a fine-tuned Qwen3 variant is beneficial.
  • Developers interested in models trained with efficient methods like Unsloth.