rizkysulaeman/Qwen3-VL-8B-Vision-GRPO-HealthCare

VISION | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Published: Feb 14, 2026 | License: mit | Architecture: Transformer | Open Weights | Cold

The rizkysulaeman/Qwen3-VL-8B-Vision-GRPO-HealthCare model is an 8 billion parameter multimodal large language model, based on the Qwen3-VL architecture, specifically fine-tuned for healthcare applications. This vision-capable model processes both text and images, offering a context length of 32768 tokens. It was fine-tuned and converted to GGUF format using Unsloth, optimizing it for efficient deployment and use in specialized healthcare contexts.

Overview

rizkysulaeman/Qwen3-VL-8B-Vision-GRPO-HealthCare is an 8-billion-parameter multimodal large language model (LLM) built on the Qwen3-VL architecture. It is fine-tuned for the healthcare domain and accepts both text and images relevant to medical contexts. Its context length is 32768 tokens, which leaves room for long inputs.

Key Characteristics

  • Multimodal Capabilities: Designed to handle both text and image inputs, making it suitable for tasks requiring visual understanding in healthcare.
  • Healthcare Specialization: Fine-tuned for use cases within the healthcare sector, with the aim of improving performance on medical text and images.
  • Efficient Deployment: The model has been converted to the GGUF format, which is optimized for efficient inference on various hardware, including CPU-only setups, via tools like llama.cpp.
  • Unsloth Optimization: Fine-tuning and GGUF conversion were done with Unsloth, a library aimed at faster and more memory-efficient fine-tuning.

Available Formats

Three quantized GGUF weight files are provided for flexible deployment, along with an F16 multimodal projector (mmproj) file that llama.cpp uses for image input:

  • qwen3-vl-8b-instruct.Q5_K_M.gguf
  • qwen3-vl-8b-instruct.Q8_0.gguf
  • qwen3-vl-8b-instruct.Q4_K_M.gguf
  • qwen3-vl-8b-instruct.F16-mmproj.gguf
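
The file names above encode the quantization label, and the mmproj file plays a different role from the weight files. As a minimal sketch, assuming only the naming pattern visible in this list, the files can be sorted into weights and projector like this. The helper name `split_gguf_files` is hypothetical and not part of any library.

```python
import re

# File names copied from the list above.
FILES = [
    "qwen3-vl-8b-instruct.Q5_K_M.gguf",
    "qwen3-vl-8b-instruct.Q8_0.gguf",
    "qwen3-vl-8b-instruct.Q4_K_M.gguf",
    "qwen3-vl-8b-instruct.F16-mmproj.gguf",
]


def split_gguf_files(names):
    """Split GGUF file names into (weights, projectors).

    weights maps a quantization label such as "Q4_K_M" to its file name;
    projectors lists the multimodal projector (mmproj) files.
    Assumes the "<base>.<label>.gguf" pattern seen in this model card.
    """
    weights, projectors = {}, []
    for name in names:
        if "mmproj" in name.lower():
            projectors.append(name)
            continue
        match = re.search(r"\.([A-Za-z0-9_]+)\.gguf$", name)
        if match:
            weights[match.group(1)] = name
    return weights, projectors


weights, projectors = split_gguf_files(FILES)
for label in sorted(weights):
    print(label, "->", weights[label])
print("projector:", projectors)
```

As a general rule, lower-bit labels such as Q4_K_M trade some accuracy for a smaller file and lower memory use, while Q8_0 stays closer to the original weights at a larger size.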

Usage

This model can be run with the llama.cpp tools. For multimodal interactions, llama-mtmd-cli is recommended; llama-cli can be used for text-only operations.
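
The two tools can be driven from a small script. The following is a sketch, not an official recipe: it assumes llama.cpp is installed with `llama-mtmd-cli` and `llama-cli` on the PATH, that the GGUF files sit in the working directory, and that the flags `-m`, `--mmproj`, `--image`, `-c`, and `-p` behave as in current llama.cpp builds (check `--help` for your version). The helper `build_command`, the image path `scan.png`, and the prompts are hypothetical.

```python
import shlex
import subprocess  # only needed if you uncomment the run call below


def build_command(model, prompt, mmproj=None, image=None, ctx_size=32768):
    """Build an argv list for llama.cpp.

    Uses llama-mtmd-cli when an image is given (requires the mmproj file),
    otherwise llama-cli for text-only prompts.
    """
    if image is not None:
        if mmproj is None:
            raise ValueError("image input needs the mmproj projector file")
        cmd = ["llama-mtmd-cli", "-m", model, "--mmproj", mmproj, "--image", image]
    else:
        cmd = ["llama-cli", "-m", model]
    # -c sets the context size; 32768 matches the length stated in this card.
    cmd += ["-c", str(ctx_size), "-p", prompt]
    return cmd


multimodal = build_command(
    model="qwen3-vl-8b-instruct.Q4_K_M.gguf",
    mmproj="qwen3-vl-8b-instruct.F16-mmproj.gguf",
    image="scan.png",  # hypothetical image path
    prompt="Describe the findings in this image.",
)
text_only = build_command(
    model="qwen3-vl-8b-instruct.Q4_K_M.gguf",
    prompt="List common causes of chest pain.",
)

print(shlex.join(multimodal))
print(shlex.join(text_only))

# Uncomment to actually run the model (requires llama.cpp and the files):
# subprocess.run(multimodal, check=True)
```

Building the argument list separately from running it keeps the command easy to inspect and avoids shell quoting problems with prompts that contain spaces.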