Parallel-R1/Parallel-R1-Unseen_Step_200
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Aug 29, 2025License:mitArchitecture:Transformer Open Weights Warm
Parallel-R1-Unseen_Step_200 is a 4 billion parameter checkpoint from the Parallel-R1 model, developed by Parallel-R1, with a context length of 40960 tokens. This specific checkpoint represents an intermediate stage after 200 reinforcement learning steps, focusing on adaptive parallel reasoning and structural exploration. It is primarily intended for reproducing experimental results related to parallel thinking as a mid-training exploration strategy in RL.
Loading preview...
Popular Sampler Settings
Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.
temperature
–
top_p
–
top_k
–
frequency_penalty
–
presence_penalty
–
repetition_penalty
–
min_p
–