rinna/deepseek-r1-distill-qwen2.5-bakeneko-32b
TEXT GENERATION
Concurrency Cost: 2 · Model Size: 32.8B · Quant: FP8 · Ctx Length: 32k · Published: Feb 10, 2025 · License: apache-2.0 · Architecture: Transformer · Open Weights

The rinna/deepseek-r1-distill-qwen2.5-bakeneko-32b model, developed by rinna, is a 32.8-billion-parameter DeepSeek-R1 distillation built on Qwen2.5 Bakeneko 32B. Fine-tuned with Chat Vector merging and Odds Ratio Preference Optimization (ORPO), it is designed for strong performance on Japanese language tasks. It follows the DeepSeek-R1 chat format and is optimized for Japanese-centric reasoning and conversational applications.
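Because the model follows the DeepSeek-R1 chat format, prompts need to be wrapped in that template's special tokens. A minimal sketch of building such a prompt by hand is below; the token strings are assumed from DeepSeek-R1's published chat template, so verify them against the model's tokenizer config (or use `tokenizer.apply_chat_template`) before relying on them:

```python
# Sketch: constructing a single-turn prompt in the DeepSeek-R1 chat format.
# The special tokens below are assumed from DeepSeek-R1's chat template;
# in practice, prefer tokenizer.apply_chat_template from the model repo.
def build_r1_prompt(user_message: str) -> str:
    # Generation starts after the assistant tag; the model typically
    # emits its reasoning before the final answer.
    return (
        "<｜begin▁of▁sentence｜>"
        f"<｜User｜>{user_message}"
        "<｜Assistant｜>"
    )

prompt = build_r1_prompt("日本の首都はどこですか？")
print(prompt)
```

In a multi-turn conversation, each prior assistant reply would be appended after its `<｜Assistant｜>` tag before the next user turn.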


Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model.

- temperature
- top_p
- top_k
- frequency_penalty
- presence_penalty
- repetition_penalty
- min_p
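The sampler parameters above map onto the fields of an OpenAI-style completion request. A minimal sketch of such a payload follows; the numeric values are illustrative placeholders (not the actual user configurations, which are not shown here), and the endpoint URL in the comment is an assumption:

```python
# Sketch: a completion-request payload carrying the sampler settings
# listed above. All numeric values are illustrative placeholders, not
# the measured "top 3" configs.
payload = {
    "model": "rinna/deepseek-r1-distill-qwen2.5-bakeneko-32b",
    "prompt": "...",
    "temperature": 0.6,        # sampling randomness
    "top_p": 0.95,             # nucleus-sampling cumulative cutoff
    "top_k": 40,               # keep only the 40 most likely tokens
    "frequency_penalty": 0.0,  # penalize tokens by how often they appeared
    "presence_penalty": 0.0,   # penalize tokens that appeared at all
    "repetition_penalty": 1.1, # multiplicative penalty on repeats
    "min_p": 0.05,             # drop tokens below 5% of the top token's prob
}

# The payload would then be POSTed to an OpenAI-compatible endpoint, e.g.
# requests.post("https://api.featherless.ai/v1/completions",
#               headers={"Authorization": "Bearer <API_KEY>"},
#               json=payload)
```

Lower temperatures generally suit reasoning-style models like this one, but tune against your own prompts.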