Mushari440/qwen3-8B-sft-v3

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Apr 17, 2026 · Architecture: Transformer

Mushari440/qwen3-8B-sft-v3 is an 8 billion parameter causal language model developed by Mushari Alothman, fine-tuned from Qwen3-8B-Base. This supervised fine-tuned (SFT) model was trained on accurate, clean supervision data covering both Arabic and English tasks. It excels at instruction following, context-based QA/RAG, and multiple-choice question answering in both languages, and supports a 32,768-token context length.


Model Overview

Mushari440/qwen3-8B-sft-v3 is an 8 billion parameter causal language model developed by Mushari Alothman. It is a supervised fine-tuned (SFT) version of Qwen3-8B-Base, trained on high-quality Arabic and English supervision data.
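
Loading the model with Hugging Face transformers might look like the minimal sketch below. The repository id comes from this card; the dtype handling, device placement, and generation settings are illustrative assumptions, and the snippet assumes the tokenizer ships the Qwen3 chat template.

```python
# A minimal sketch: load the model and run one instruction-following turn.
# The repo id is from this card; generation settings are illustrative
# assumptions, not values published by the author.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Mushari440/qwen3-8B-sft-v3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # defer to the checkpoint's stored dtype
    device_map="auto",   # requires the accelerate package
)

messages = [
    {"role": "user", "content": "Summarize the benefits of supervised fine-tuning in two sentences."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated continuation, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```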

Key Capabilities

This model is designed for a range of natural language processing tasks, particularly focusing on:

  • Bilingual Performance: Strong capabilities in both Arabic and English.
  • Question Answering: Proficient in multiple-choice question (MCQ) answering and context-based Question Answering (QA) / Retrieval-Augmented Generation (RAG); see the prompt sketch after this list.
  • Instruction Following: General instruction following across various prompts.
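
As a concrete illustration of the context-based QA/RAG capability, a retrieved passage and a question could be wrapped into a single chat turn as sketched below. The prompt layout and the build_rag_prompt helper are hypothetical; the card does not prescribe a specific context/question format.

```python
# A hedged sketch of context-based QA (RAG-style) prompting.
# build_rag_prompt is a hypothetical helper; the model card does not
# specify a required context/question layout.
def build_rag_prompt(context: str, question: str) -> list[dict]:
    """Wrap retrieved context and a question into a single chat turn."""
    user_message = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
    return [{"role": "user", "content": user_message}]

# The same layout works for Arabic and English alike:
messages = build_rag_prompt(
    context="الرياض هي عاصمة المملكة العربية السعودية.",  # "Riyadh is the capital of Saudi Arabia."
    question="ما هي عاصمة المملكة العربية السعودية؟",      # "What is the capital of Saudi Arabia?"
)
```

The resulting messages list can be passed to tokenizer.apply_chat_template exactly as in the loading example above.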

Training Details

The model underwent supervised fine-tuning (SFT) using bf16 mixed precision. The training data comprised curated Arabic and English datasets, including examples for MCQ, QA/RAG, context understanding, and general instruction following.
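
For readers who want to reproduce a comparable run, a minimal SFT setup in bf16 mixed precision might look like the following sketch, here using TRL's SFTTrainer. The dataset file, hyperparameters, and the choice of trainer are assumptions; only "SFT from Qwen3-8B-Base in bf16" comes from this card.

```python
# A minimal sketch of the kind of SFT run described above, using TRL.
# Dataset name and hyperparameters are illustrative assumptions; only
# "SFT in bf16 from Qwen3-8B-Base" is stated in this card.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Hypothetical JSONL of chat-formatted Arabic/English examples
# (instruction following, MCQ, and context-based QA/RAG).
dataset = load_dataset("json", data_files="sft_mix_ar_en.jsonl", split="train")

config = SFTConfig(
    output_dir="qwen3-8B-sft-v3",
    bf16=True,  # bf16 mixed precision, as stated in this card
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
    learning_rate=1e-5,
)

trainer = SFTTrainer(
    model="Qwen/Qwen3-8B-Base",  # the stated base model
    train_dataset=dataset,
    args=config,
)
trainer.train()
```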

Intended Use Cases

This model is suitable for applications requiring robust language understanding and generation in Arabic and English, such as chatbots, content generation, and information retrieval systems where accurate responses to instructions and questions are critical.