yunjae-won/mpq3_llama8b_sft_dpo_beta1e-1_step1536

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8k · Published: Apr 6, 2026 · Architecture: Transformer · Status: Cold

yunjae-won/mpq3_llama8b_sft_dpo_beta1e-1_step1536 is an 8-billion-parameter language model with an 8192-token context length. It is a fine-tuned variant, likely based on the Llama architecture, trained with Supervised Fine-Tuning (SFT) followed by Direct Preference Optimization (DPO). Its specific differentiators and primary use cases are not detailed in the model card, which reads "More Information Needed" across most sections.


Model Overview

This model, yunjae-won/mpq3_llama8b_sft_dpo_beta1e-1_step1536, is an 8-billion-parameter language model with an 8192-token context length. While the base model and training details are not stated in the current model card, the naming convention suggests it underwent Supervised Fine-Tuning (SFT) followed by Direct Preference Optimization (DPO) with a beta value of 1e-1, trained to step 1536.
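For context, the "beta1e-1" in the name plausibly refers to the β coefficient in the standard DPO objective (Rafailov et al., 2023), which controls how strongly the policy is penalized for deviating from the reference (typically SFT) model. This interpretation is an assumption drawn from the naming convention, not from the model card:

$$
\mathcal{L}_{\text{DPO}}(\pi_\theta;\pi_{\text{ref}}) = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}\left[\log\sigma\!\left(\beta\log\frac{\pi_\theta(y_w\mid x)}{\pi_{\text{ref}}(y_w\mid x)} - \beta\log\frac{\pi_\theta(y_l\mid x)}{\pi_{\text{ref}}(y_l\mid x)}\right)\right]
$$

Here $y_w$ and $y_l$ are the preferred and dispreferred responses for prompt $x$ and $\sigma$ is the sigmoid; a relatively small β such as 1e-1 allows larger departures from the reference model during preference tuning.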

Key Characteristics

  • Parameters: 8 billion
  • Context Length: 8192 tokens
  • Optimization: Likely fine-tuned using Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO)
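Since the model card provides no usage instructions, the following is a minimal sketch of how a checkpoint with this repo id would typically be loaded with Hugging Face transformers, assuming it is published on the Hub in the standard Llama causal-LM format (an assumption; verify against the actual repository files):

```python
# Minimal sketch: loading the checkpoint via Hugging Face transformers.
# Assumes the repo id resolves on the Hub and the checkpoint follows the
# standard Llama causal-LM layout.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "yunjae-won/mpq3_llama8b_sft_dpo_beta1e-1_step1536"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # common choice for an 8B model on one GPU
    device_map="auto",           # requires the accelerate package
)

prompt = "Explain Direct Preference Optimization in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The FP8 quantization noted in the header above likely describes the hosting platform's serving configuration rather than the checkpoint itself, so a plain bf16 load as sketched here is the more typical local path.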

Current Limitations

The provided model card marks nearly every field as "More Information Needed," including the model's development, funding, model type, language(s), license, finetuning origin, intended uses, biases, risks, limitations, training data, training procedure, and evaluation results. Without this information it is difficult to assess the model's full capabilities, appropriate applications, and potential risks; further details are required for comprehensive recommendations and responsible deployment.