beachcities/qwen3-4b-sft-dpo-v2-structeval

Text generation · Model size: 4B · Quantization: BF16 · Context length: 32k · Published: Feb 7, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

beachcities/qwen3-4b-sft-dpo-v2-structeval is a 4 billion parameter Qwen3 model developed by beachcities, fine-tuned from unsloth/Qwen3-4B-Instruct-2507. It was trained with Unsloth and Hugging Face's TRL library, which the authors report yields 2x faster training. The model targets general language tasks.


Model Overview

The beachcities/qwen3-4b-sft-dpo-v2-structeval is a 4 billion parameter language model based on the Qwen3 architecture. Developed by beachcities, this model was fine-tuned from the unsloth/Qwen3-4B-Instruct-2507 base model.

Key Characteristics

  • Efficient Training: The model was trained 2x faster (per the authors) by combining Unsloth with Hugging Face's TRL library, an optimization aimed at reducing fine-tuning cost.
  • Base Model: It builds upon the Qwen3-4B-Instruct foundation, suggesting capabilities for instruction-following tasks.
  • License: The model is released under the Apache-2.0 license, allowing for broad usage and distribution.
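Since the model inherits from the Qwen3 instruct family, it most likely expects a ChatML-style conversation format. The sketch below illustrates that format in plain Python; the `<|im_start|>`/`<|im_end|>` delimiters are an assumption based on the Qwen3 family, not something this card confirms, and in practice you would rely on `tokenizer.apply_chat_template` instead.

```python
def build_chat_prompt(messages):
    """Render a list of {role, content} dicts into a ChatML-style prompt.

    Assumes Qwen3-family <|im_start|>/<|im_end|> delimiters (an assumption,
    not confirmed by this model card). Prefer tokenizer.apply_chat_template,
    which uses the template shipped with the model.
    """
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")  # cue the model to respond
    return "".join(parts)

prompt = build_chat_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize TRL in one sentence."},
])
```

Using the tokenizer's built-in template keeps you safe if the maintainers change the conversation format between revisions.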

Potential Use Cases

Given its Qwen3 base and instruction-tuned origin, this model is suited to general-purpose natural language tasks such as instruction following, summarization, and question answering. Its efficient training recipe may also make it attractive to developers who want to fine-tune further without a large compute budget.
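A minimal loading-and-generation sketch using the standard Hugging Face transformers API is shown below. The repository id comes from this card; everything else is a generic pattern, not usage instructions from the model authors, and running it will download the full 4B-parameter checkpoint.

```python
def load_and_generate(prompt, model_id="beachcities/qwen3-4b-sft-dpo-v2-structeval"):
    """Download the model from the Hub and generate a completion.

    Generic transformers usage pattern (an illustration, not the authors'
    documented recipe); requires the transformers and torch packages.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

if __name__ == "__main__":
    print(load_and_generate("Explain direct preference optimization briefly."))
```

The `__main__` guard keeps the expensive download out of module import; `device_map="auto"` places weights on a GPU when one is available and falls back to CPU otherwise.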