AXCXEPT/EZO-Qwen2.5-72B-Instruct
Text Generation · Published: Sep 20, 2024 · License: apache-2.0

Model size: 72.7B
Quantization: FP8
Context length: 32k
Concurrency cost: 4
Architecture: Transformer

AXCXEPT/EZO-Qwen2.5-72B-Instruct is a 72.7-billion-parameter instruction-tuned causal language model developed by AXCXEPT, based on Qwen/Qwen2.5-72B-Instruct, with a 131,072-token context length. The model has undergone multiple tuning iterations to improve overall performance, and it excels particularly at Japanese language tasks. It scored higher than GPT-4-Turbo on the Japanese MT Bench (with GPT-4o as the evaluator), while retaining strong multilingual capabilities despite its Japanese focus.
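Like the base Qwen2.5-Instruct models, this model expects prompts in the ChatML format (`<|im_start|>role ... <|im_end|>`). Below is a minimal illustrative sketch of that template; in practice the Hugging Face tokenizer's `apply_chat_template` produces it for you, and the helper name here is hypothetical:

```python
def to_chatml(messages):
    """Render a list of {role, content} messages into a Qwen2.5-style
    ChatML prompt, ending with the assistant header so the model
    continues from there."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "あなたは誠実で優秀な日本人のアシスタントです。"},
    {"role": "user", "content": "日本の首都はどこですか？"},
])
```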


Popular Sampler Settings

The top 3 parameter combinations used by Featherless users for this model adjust the following sampler parameters:

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p
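As a sketch of how these parameters are typically passed, here is a hypothetical chat-completions request body in the OpenAI-compatible style that Featherless exposes. The values below are illustrative only, not the actual popular configurations for this model, and fields like `top_k`, `repetition_penalty`, and `min_p` are server-side extensions rather than standard OpenAI parameters:

```python
import json

# Illustrative sampler values -- placeholders, not this model's popular configs.
payload = {
    "model": "AXCXEPT/EZO-Qwen2.5-72B-Instruct",
    "messages": [{"role": "user", "content": "こんにちは"}],
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 40,                  # extension accepted by many inference servers
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "repetition_penalty": 1.05,   # server-specific extension
    "min_p": 0.05,                # server-specific extension
}
body = json.dumps(payload)        # ready to POST to a chat-completions endpoint
```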