uparupa8810/competition-dpo
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Feb 13, 2026License:apache-2.0Architecture:Transformer Open Weights Warm
The uparupa8810/competition-dpo model is a Qwen3-based causal language model, fine-tuned by uparupa8810. It was trained using Unsloth and Huggingface's TRL library, indicating an optimization for efficient fine-tuning processes. This model is designed for general language generation tasks, leveraging the Qwen3 architecture for its capabilities.
Loading preview...