cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese
Hugging Face
TEXT GENERATIONConcurrency Cost:2Model Size:32BQuant:FP8Ctx Length:32kPublished:Jan 27, 2025License:mitArchitecture:Transformer0.3K Open Weights Warm

cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese is a 32 billion parameter Japanese-finetuned causal language model developed by CyberAgent, based on deepseek-ai/DeepSeek-R1-Distill-Qwen-32B. This model is optimized for Japanese language understanding and generation, leveraging a 32768 token context length for complex tasks. Its primary strength lies in providing high-quality responses in Japanese, making it suitable for applications requiring robust Japanese NLP capabilities.

Loading preview...

Model Overview

This model, cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese, is a 32 billion parameter language model specifically fine-tuned for the Japanese language. It is built upon the deepseek-ai/DeepSeek-R1-Distill-Qwen-32B architecture, inheriting its foundational capabilities while specializing in Japanese linguistic nuances.

Key Capabilities

  • Japanese Language Proficiency: Optimized for understanding and generating text in Japanese, making it highly effective for Japanese-centric applications.
  • Large Context Window: Features a substantial context length of 32768 tokens, enabling it to process and generate coherent responses for lengthy inputs and complex conversations.
  • Instruction Following: Designed to follow instructions effectively, as demonstrated by its chat template usage for conversational AI.

Use Cases

  • Japanese Chatbots and Virtual Assistants: Ideal for developing conversational agents that interact naturally in Japanese.
  • Content Generation: Suitable for creating various forms of Japanese text, including articles, summaries, and creative writing.
  • Language Understanding Tasks: Can be applied to tasks such as sentiment analysis, information extraction, and question answering in Japanese contexts.

This model is released under the MIT License, allowing for flexible use and distribution.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p