beachcities/qwen3-4b-sft-dpo-v25mix-structeval

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Feb 8, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

The beachcities/qwen3-4b-sft-dpo-v25mix-structeval is a 4 billion parameter Qwen3-based language model developed by beachcities. Fine-tuned from unsloth/Qwen3-4B-Instruct-2507, this model was trained using Unsloth and Huggingface's TRL library, achieving 2x faster training. It features a substantial 40960 token context length, making it suitable for applications requiring extensive contextual understanding and generation.

Loading preview...

Overview

The beachcities/qwen3-4b-sft-dpo-v25mix-structeval is a 4 billion parameter language model developed by beachcities. It is a fine-tuned variant of the unsloth/Qwen3-4B-Instruct-2507 base model, leveraging the Qwen3 architecture. A key characteristic of this model is its training methodology, which utilized Unsloth and Huggingface's TRL library, resulting in a reported 2x acceleration in training speed. This model is designed for tasks benefiting from its 40960 token context window.

Key Capabilities

  • Efficient Training: Benefits from Unsloth's optimizations for faster fine-tuning.
  • Extended Context: Supports a 40960 token context length, enabling processing of longer inputs and generating more coherent, extended outputs.
  • Qwen3 Architecture: Built upon the robust Qwen3 foundation, providing strong general language understanding and generation abilities.

Good for

  • Applications requiring a compact yet capable language model with a very large context window.
  • Scenarios where efficient fine-tuning is a priority.
  • Tasks involving long-form text analysis, summarization, or generation.