cyberagent/DeepSeek-R1-Distill-Qwen-14B-Japanese
TEXT GENERATIONConcurrency Cost:1Model Size:14BQuant:FP8Ctx Length:32kPublished:Jan 27, 2025License:mitArchitecture:Transformer0.1K Open Weights Warm

The cyberagent/DeepSeek-R1-Distill-Qwen-14B-Japanese is a 14 billion parameter language model, fine-tuned for Japanese language tasks. It is based on the DeepSeek-R1-Distill-Qwen-14B architecture, leveraging its reasoning capabilities. This model is specifically optimized for generating Japanese text and understanding Japanese queries, making it suitable for applications requiring high-quality Japanese language processing.

Loading preview...

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p