BytedTsinghua-SIA/DAPO-Qwen-32B
Text generation
Concurrency cost: 2
Model size: 32.8B
Quantization: FP8
Context length: 32k
Published: Apr 9, 2025
License: apache-2.0
Architecture: Transformer
Open weights

DAPO-Qwen-32B is a 32.8-billion-parameter language model developed by BytedTsinghua-SIA, built on the Qwen2.5-32B architecture. It is trained with the DAPO (Decoupled Clip and Dynamic sAmpling Policy Optimization) reinforcement learning algorithm and specializes in mathematical problem-solving. The underlying model supports a context length of up to 131,072 tokens, and it is optimized for complex reasoning tasks that require step-by-step mathematical solutions.
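As a minimal sketch of local inference, the snippet below loads the checkpoint with Hugging Face `transformers` and prompts it with a math question. It assumes the checkpoint follows the standard Qwen2.5 chat template and that you have a GPU (or GPUs) with enough memory for a 32B model; the prompt and generation settings are illustrative only.

```python
# Hedged sketch: local inference with transformers, assuming the standard
# Qwen2.5 chat template and sufficient GPU memory for a 32B checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BytedTsinghua-SIA/DAPO-Qwen-32B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto",    # shard across available GPUs
)

messages = [
    {"role": "user",
     "content": "Solve step by step: what is the sum of the first 100 positive integers?"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=1024)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```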


Popular Sampler Settings

The most popular parameter combinations used by Featherless users for this model tune the following sampler settings:

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p
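The sketch below shows how these sampler settings might be passed to an OpenAI-compatible chat completions endpoint. The base URL, the specific values, and the `extra_body` pass-through for non-standard samplers (`top_k`, `repetition_penalty`, `min_p`) are assumptions; consult the Featherless API documentation for the exact fields your deployment accepts.

```python
# Hedged sketch: sending sampler settings to an OpenAI-compatible endpoint.
# The base URL, API key placeholder, and all sampler values below are
# illustrative assumptions, not a recommended configuration.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",  # assumed OpenAI-compatible endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="BytedTsinghua-SIA/DAPO-Qwen-32B",
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
    # Standard OpenAI-compatible sampler fields:
    temperature=0.6,
    top_p=0.95,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    # Non-standard samplers, passed through if the server supports them:
    extra_body={
        "top_k": 40,
        "repetition_penalty": 1.05,
        "min_p": 0.05,
    },
)
print(response.choices[0].message.content)
```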