E27085921/HIKARI-Sirius-8B-SkinDx-RAG
HIKARI-Sirius-8B-SkinDx-RAG is an 8 billion parameter vision-language model developed by E27085921, fine-tuned from Qwen3-VL-8B-Thinking with a 32768 token context length. It specializes in 10-class skin disease diagnosis, achieving 85.86% accuracy on the SkinCAP dataset. Its key innovation is 'RAG-in-Training,' embedding retrieval-augmented generation directly into the fine-tuning process to enhance robustness against visual similarities between diseases.
Loading preview...
HIKARI-Sirius-8B-SkinDx-RAG: Advanced Skin Disease Diagnosis
HIKARI-Sirius is a specialized 8 billion parameter vision-language model, fine-tuned from Qwen/Qwen3-VL-8B-Thinking, designed for 10-class skin disease diagnosis. It is a fully merged model, meaning LoRA adapter weights are integrated directly, allowing for standard loading with transformers, vLLM, or SGLang without additional adapter loading.
Key Capabilities & Innovations
- RAG-in-Training: This model introduces a novel approach where retrieval-augmented generation is embedded during the fine-tuning process. This allows the model to learn by comparing query images against retrieved reference images and their clinical captions, significantly improving diagnostic robustness.
- High Accuracy: Achieves 85.86% validation accuracy on the SkinCAP Thai dermatology dataset for 10 distinct skin disease classes.
- Multimodal Diagnosis: Processes both image and text inputs to provide specific skin disease classifications.
- Optimized Performance: Demonstrates high throughput with
vLLM BnB-4bit(5.57 img/s) andSGLang FP8(9.11 img/s) on an RTX 5070 Ti.
Ideal Use Cases
- Automated Skin Disease Screening: Suitable for integration into healthcare pipelines for preliminary diagnosis of 10 common skin conditions.
- Medical Research: Can serve as a robust baseline or component in studies involving dermatological image analysis.
- Clinical Decision Support: Assists medical professionals by providing accurate, AI-driven diagnostic insights for skin lesions.
This model represents Stage 2 of the HIKARI pipeline, taking a grouped skin lesion image (e.g., 'inflammatory') and classifying it into a specific disease label.