yunjae-won/mpq3_qwen4bi_sft_dpo_beta1e-1_step2304

Hugging Face
TEXT GENERATION | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Ctx Length: 32k | Published: Apr 6, 2026 | Architecture: Transformer | Warm

yunjae-won/mpq3_qwen4bi_sft_dpo_beta1e-1_step2304 is a 4-billion-parameter language model, likely based on the Qwen architecture and fine-tuned with Supervised Fine-Tuning (SFT) followed by Direct Preference Optimization (DPO). It targets general language generation and supports a 32768-token context length for long inputs. The model card does not describe its differentiators or primary use cases.


Model Overview

yunjae-won/mpq3_qwen4bi_sft_dpo_beta1e-1_step2304 is a 4-billion-parameter language model. The model card gives no architectural details, but the model name suggests a Qwen-series base that was further trained with Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO). The model supports a context length of 32768 tokens, so it can take long textual inputs.
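To make the DPO step concrete, here is a minimal sketch of the standard per-example DPO loss in plain Python. The default beta = 0.1 is an assumption read off the `beta1e-1` fragment of the model name; the model card does not confirm it. The function name `dpo_loss` and the log-probability values in the usage line are hypothetical illustration values, not figures from this model's training.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard per-example DPO loss: -log(sigmoid(beta * margin)).

    beta=0.1 is an assumption inferred from "beta1e-1" in the model name.
    """
    # How much more (in log space) the policy likes each response
    # than the frozen reference model does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    x = beta * (chosen_logratio - rejected_logratio)
    # -log(sigmoid(x)) == softplus(-x), written in a numerically stable form.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))

# Hypothetical sequence log-probabilities, for illustration only.
print(dpo_loss(-10.0, -14.0, -12.0, -12.0))
```

When the policy matches the reference, the margin is zero and the loss equals log 2; the loss falls below that as the policy favors the chosen response more strongly than the reference does.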

Key Capabilities

  • Large Context Window: With a 32768 token context length, the model can process and generate responses based on extensive input texts.
  • Fine-tuned Performance: The application of SFT and DPO suggests an optimization for instruction following and alignment with human preferences, aiming for improved response quality and relevance.
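As a rough illustration of working within the 32768-token window, the sketch below computes how many prompt tokens fit once room is reserved for the reply, and trims an over-long prompt. It assumes the prompt and the generated tokens share one window, which is typical for decoder-only models but is not stated in the model card. The helper names `max_prompt_tokens` and `truncate_left` are hypothetical, and keeping the most recent tokens is just one possible truncation policy.

```python
CONTEXT_LENGTH = 32768  # context length taken from the model listing

def max_prompt_tokens(max_new_tokens, context_length=CONTEXT_LENGTH):
    """Tokens left for the prompt after reserving space for generation."""
    if not 0 <= max_new_tokens < context_length:
        raise ValueError("max_new_tokens must be in [0, context_length)")
    return context_length - max_new_tokens

def truncate_left(token_ids, max_new_tokens, context_length=CONTEXT_LENGTH):
    """Keep only the most recent tokens that fit in the prompt budget."""
    budget = max_prompt_tokens(max_new_tokens, context_length)
    return token_ids[-budget:]

# Reserving 1024 tokens for the reply leaves 31744 for the prompt.
print(max_prompt_tokens(1024))  # → 31744
```

In practice `token_ids` would come from the model's own tokenizer; plain integer lists are enough to exercise the logic here.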

Good For

  • General language generation tasks requiring understanding of long contexts.
  • Applications where a balance between model size (4B parameters) and fine-tuned performance is desired.
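For a sense of what the 4B size means in practice, here is a back-of-the-envelope estimate of the memory needed for the weights alone. It assumes exactly 4e9 parameters (the listing only says "4B", so the true count may differ) stored at 2 bytes each for BF16, and it ignores the KV cache, activations, and framework overhead. The function name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params, bytes_per_param=2):
    """Approximate weight memory in GiB (BF16 uses 2 bytes per parameter)."""
    return num_params * bytes_per_param / 2**30

# Nominal 4B parameters in BF16; the exact parameter count is an assumption.
print(round(weight_memory_gib(4e9), 2))  # → 7.45
```

Actual serving memory will be higher, particularly with long prompts, since the KV cache grows with the number of tokens in context.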

Further details regarding specific use cases, training data, evaluation metrics, and potential biases are not available in the provided model card.