didula-wso2/Qwen3-8B_with_reasonningsft_16bit_vllm

  • Task: Text generation
  • Concurrency cost: 1
  • Model size: 8B
  • Quantization: FP8
  • Context length: 32k
  • Published: Apr 21, 2026
  • License: apache-2.0
  • Architecture: Transformer (open weights)

didula-wso2/Qwen3-8B_with_reasonningsft_16bit_vllm is an 8-billion-parameter Qwen3 model fine-tuned by didula-wso2. It was trained with Unsloth and Hugging Face's TRL library, with an emphasis on faster training. With a 32,768-token context window, it targets reasoning-oriented tasks.


Model Overview

This model, developed by didula-wso2, is an 8-billion-parameter variant of the Qwen3 architecture. It was fine-tuned from the unsloth/qwen3-8b-unsloth-bnb-4bit base model, a 4-bit quantized Unsloth build of Qwen3-8B, indicating a focus on efficient training and deployment.

Key Capabilities

  • Efficient Fine-tuning: Leverages Unsloth and Hugging Face's TRL library for faster training, reportedly around 2x quicker.
  • Qwen3 Architecture: Built upon the Qwen3 foundation, providing a robust base for language understanding and generation.
  • Reasoning Focus: The model name suggests an optimization for reasoning tasks, making it suitable for applications requiring logical inference.
  • VLLM Compatibility: Designed for use with vLLM, indicating readiness for high-throughput inference.
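Since the repository name advertises vLLM compatibility, serving it would typically look like the following. This is a sketch, not an official recipe: the flags shown are standard vLLM options, and the endpoint is vLLM's OpenAI-compatible default.

```shell
# Serve the model with vLLM's OpenAI-compatible server (requires a GPU).
# --max-model-len matches the advertised 32,768-token context window.
vllm serve didula-wso2/Qwen3-8B_with_reasonningsft_16bit_vllm \
  --max-model-len 32768

# Query it from another terminal:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "didula-wso2/Qwen3-8B_with_reasonningsft_16bit_vllm",
        "messages": [{"role": "user", "content": "If x + 3 = 7, what is x?"}]
      }'
```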

Good For

  • Applications requiring a Qwen3-based model with an emphasis on reasoning capabilities.
  • Scenarios where efficient fine-tuning and deployment are critical.
  • Projects benefiting from a model with a 32,768 token context window for handling longer inputs.
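Because Qwen3 reasoning variants emit their chain of thought inside `<think>…</think>` tags before the final answer, downstream code usually needs to split the two. A minimal sketch of that post-processing step (the tag convention is Qwen3's; the sample completion string is invented for illustration):

```python
import re

def split_reasoning(completion: str) -> tuple[str, str]:
    """Separate a Qwen3-style <think>...</think> block from the final answer."""
    match = re.search(r"<think>(.*?)</think>", completion, flags=re.DOTALL)
    if match is None:
        return "", completion.strip()          # no reasoning block present
    reasoning = match.group(1).strip()
    answer = completion[match.end():].strip()  # everything after </think>
    return reasoning, answer

# Invented sample output for illustration:
sample = "<think>x + 3 = 7, so x = 7 - 3 = 4.</think>\nx = 4"
reasoning, answer = split_reasoning(sample)
print(answer)  # → x = 4
```

Keeping the reasoning text separate lets an application log or display it independently of the answer shown to users.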