cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese
TEXT GENERATION · Concurrency cost: 2 · Model size: 32B · Quant: FP8 · Context length: 32k · Published: Jan 27, 2025 · License: MIT · Architecture: Transformer · 0.3K · Open Weights · Warm
cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese is a 32-billion-parameter Japanese-finetuned causal language model developed by CyberAgent, based on deepseek-ai/DeepSeek-R1-Distill-Qwen-32B. The model is optimized for Japanese language understanding and generation and supports a 32,768-token context length for complex tasks. Its primary strength is producing high-quality responses in Japanese, making it well suited to applications that require robust Japanese NLP capabilities.
Popular Sampler Settings
Top 3 parameter combinations used by Featherless users for this model.
temperature: –
top_p: –
top_k: –
frequency_penalty: –
presence_penalty: –
repetition_penalty: –
min_p: –
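The sampler parameters listed above are typically passed alongside the model ID in a chat-completions request. Below is a minimal sketch of assembling such a payload in Python; the default values for `temperature` and `top_p`, the `max_tokens` setting, and the `build_request` helper are illustrative assumptions, not values taken from this page.

```python
import json

# Model ID as published on the page.
MODEL_ID = "cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese"

def build_request(prompt: str, temperature: float = 0.6, top_p: float = 0.95) -> dict:
    """Assemble an OpenAI-style chat-completions payload with common
    sampler settings. Defaults here are assumptions for illustration."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": 512,  # assumed cap; adjust to your use case
    }

# Inspect the payload that would be POSTed to a chat-completions endpoint.
payload = build_request("日本語で自己紹介してください。")
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

Parameters such as `top_k`, `frequency_penalty`, `presence_penalty`, `repetition_penalty`, and `min_p` can be added to the same dictionary when the serving endpoint supports them.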