abeja/ABEJA-Qwen2.5-7b-Japanese-v0.1

  • Visibility: Public
  • Parameters: 7.6B
  • Quantization: FP8
  • Context length: 32,768 tokens
  • Updated: Mar 12, 2025
  • License: apache-2.0
  • Source: Hugging Face

ABEJA-Qwen2.5-7b-Japanese-v0.1 is a 7.6 billion parameter language model developed by ABEJA, based on Qwen/Qwen2.5-7B-Instruct. It was trained by distillation from abeja/ABEJA-Qwen2.5-32b-Japanese-v0.1, with a focus on Japanese language capabilities, and its instruction-following performance is enhanced through a ChatVector, making it suitable for Japanese-centric conversational AI applications.

Overview

ABEJA-Qwen2.5-7b-Japanese-v0.1 is a 7.6 billion parameter model developed by ABEJA on top of the Qwen/Qwen2.5-7B-Instruct architecture. Rather than relying on typical continued pre-training, the model was trained with a distillation approach, learning from the larger abeja/ABEJA-Qwen2.5-32b-Japanese-v0.1 model.
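
The snippet below is a minimal usage sketch, assuming the standard Qwen2.5 chat interface in transformers; the dtype and device settings are illustrative and should be adapted to your hardware.

```python
# Minimal usage sketch: assumes the standard Qwen2.5 chat interface in transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "abeja/ABEJA-Qwen2.5-7b-Japanese-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "user", "content": "日本の四季について簡単に説明してください。"}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```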

Key Characteristics

  • Distillation Training: Utilizes knowledge distillation from the larger 32B-parameter Japanese model, optimizing for efficiency while retaining strong performance (a generic loss sketch follows this list).
  • Japanese Language Focus: Specifically trained for Japanese language tasks, making it highly relevant for applications requiring robust Japanese understanding and generation.
  • Enhanced Instruction Following: Improves instruction adherence by applying a ChatVector (the difference vector between Qwen/Qwen2.5-7B-Instruct and Qwen/Qwen2.5-7B) without additional post-training; see the merge sketch after this list.
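
The distillation recipe itself is not detailed in this card, but the general technique can be sketched: the 7B student is trained to match the temperature-softened output distribution of the 32B teacher. The helper below is a generic, hypothetical illustration, not ABEJA's actual training code.

```python
# Hypothetical logit-distillation loss; only illustrates the general technique.
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between temperature-softened teacher and student
    next-token distributions, for logits of shape (batch, vocab)."""
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    # The t**2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (t ** 2)
```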
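The ChatVector application can be sketched in a similar spirit: the weights of Qwen/Qwen2.5-7B are subtracted from those of Qwen/Qwen2.5-7B-Instruct, and the resulting difference vector is added to the Japanese-adapted model. The "japanese-base" path below is a hypothetical placeholder for the pre-merge distilled checkpoint, and the sketch assumes all three checkpoints share identical parameter names and shapes.

```python
# Illustrative ChatVector merge; ABEJA's exact procedure may differ in detail.
import torch
from transformers import AutoModelForCausalLM

def load_state_dict(name):
    return AutoModelForCausalLM.from_pretrained(
        name, torch_dtype=torch.bfloat16
    ).state_dict()

sd_base = load_state_dict("Qwen/Qwen2.5-7B")
sd_instruct = load_state_dict("Qwen/Qwen2.5-7B-Instruct")
sd_japanese = load_state_dict("path/to/japanese-base")  # hypothetical checkpoint

merged = {}
with torch.no_grad():
    for key, weight in sd_japanese.items():
        # ChatVector: the parameter-wise difference that encodes the
        # instruction-following behaviour learned by the Instruct model.
        chat_vector = sd_instruct[key] - sd_base[key]
        merged[key] = weight + chat_vector
# `merged` can then be loaded into a model via model.load_state_dict(merged).
```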

Use Cases

This model is particularly well-suited for:

  • Japanese Conversational AI: Developing chatbots and virtual assistants that interact effectively in Japanese.
  • Japanese Text Generation: Tasks requiring high-quality Japanese text output, such as content creation or summarization.
  • Applications requiring efficient Japanese LLMs: Distillation from the 32B model balances output quality against computational cost for Japanese-specific workloads.