cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese
Text generation · Concurrency cost: 2 · Model size: 32B · Quantization: FP8 · Context length: 32k · Published: Jan 27, 2025 · License: MIT · Architecture: Transformer · Open weights

cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese is a 32-billion-parameter causal language model developed by CyberAgent, fine-tuned for Japanese on top of deepseek-ai/DeepSeek-R1-Distill-Qwen-32B. It is optimized for Japanese language understanding and generation and supports a 32,768-token context length for long or complex tasks. Its primary strength is producing high-quality Japanese responses, making it well suited to applications that require robust Japanese NLP capabilities.


Popular Sampler Settings

The three most popular parameter combinations used by Featherless users for this model are built from the following sampler settings:

- `temperature` — scales the sharpness of the output distribution; lower values make generation more deterministic
- `top_p` — nucleus sampling: restricts candidates to the smallest token set whose cumulative probability exceeds p
- `top_k` — samples only from the k most likely tokens
- `frequency_penalty` — penalizes tokens in proportion to how often they have already appeared
- `presence_penalty` — applies a flat penalty to any token that has appeared at least once
- `repetition_penalty` — applies a multiplicative penalty to the logits of previously generated tokens
- `min_p` — discards tokens whose probability falls below a set fraction of the most likely token's probability
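As a minimal sketch of how these settings might be used in practice, the snippet below builds a chat completion request for this model, assuming an OpenAI-compatible endpoint (as Featherless exposes). The sampler values shown are illustrative placeholders, not the actual user presets from the page; note that `repetition_penalty` and `min_p` are extensions accepted by many inference servers rather than standard OpenAI parameters.

```python
import json

# Illustrative sampler values -- placeholders, not the presets
# shown in the interactive tabs on the model page.
sampler_settings = {
    "temperature": 0.6,
    "top_p": 0.95,
    "top_k": 40,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "repetition_penalty": 1.1,  # non-standard extension, widely supported
    "min_p": 0.05,              # non-standard extension, widely supported
}

def build_request(prompt: str, settings: dict) -> dict:
    """Build an OpenAI-compatible chat completion payload for this model."""
    return {
        "model": "cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese",
        "messages": [{"role": "user", "content": prompt}],
        **settings,
    }

payload = build_request("日本語で自己紹介してください。", sampler_settings)
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

The payload would then be POSTed to the provider's `/chat/completions` route with an API key; only the dictionary construction is shown here.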