Mushari440/Qwen3-8B-SFT-chatml

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:Apr 5, 2026Architecture:Transformer Warm

Mushari440/Qwen3-8B-SFT-chatml is an 8 billion parameter causal language model developed by Mushari Alothman, fine-tuned from Qwen3-8B-Base. This supervised fine-tuned (SFT) model is optimized for accurate, clean supervision across both Arabic and English tasks. It excels in use cases such as MCQ answering, context-based QA/RAG, and general instruction following in both languages. The model supports a context length of 32768 tokens.

Loading preview...

Model Overview

Mushari440/Qwen3-8B-SFT-chatml is an 8 billion parameter causal language model developed by Mushari Alothman. It is a supervised fine-tuned (SFT) version of the Qwen3-8B-Base model, specifically optimized for high-quality performance in both Arabic and English language tasks. The model operates with bf16 mixed precision during training.

Key Capabilities

This model is designed for a range of direct applications, including:

  • Multilingual Question Answering: Proficient in answering Multiple Choice Questions (MCQ) in both Arabic and English.
  • Context-Based QA/RAG: Capable of performing Question Answering and Retrieval-Augmented Generation (RAG) tasks that require understanding and processing contextual information.
  • General Instruction Following: Excels at adhering to and executing general instructions provided in natural language.

Training Details

The model underwent Supervised Fine-Tuning (SFT) using curated datasets that encompass a variety of tasks in both Arabic and English. These datasets specifically included examples for MCQ, QA/RAG, context understanding, and general instruction following, ensuring a broad and robust training foundation. The model is licensed under Apache 2.0.

Intended Use Cases

This model is well-suited for applications requiring accurate and clean responses in Arabic and English. However, it is explicitly noted as out-of-scope for safety-critical or real-time decision-making, and for generating factual guarantees without external verification.