cyberagent/DeepSeek-R1-Distill-Qwen-14B-Japanese
TEXT GENERATIONConcurrency Cost:1Model Size:14BQuant:FP8Ctx Length:32kPublished:Jan 27, 2025License:mitArchitecture:Transformer0.1K Open Weights Warm
The cyberagent/DeepSeek-R1-Distill-Qwen-14B-Japanese is a 14 billion parameter language model, fine-tuned for Japanese language tasks. It is based on the DeepSeek-R1-Distill-Qwen-14B architecture, leveraging its reasoning capabilities. This model is specifically optimized for generating Japanese text and understanding Japanese queries, making it suitable for applications requiring high-quality Japanese language processing.
Loading preview...
Popular Sampler Settings
Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.
temperature
top_p
–
top_k
–
frequency_penalty
–
presence_penalty
–
repetition_penalty
–
min_p
–