Shimin/qwen3_vl_8b_foreagent

VISION | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Published: Feb 15, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

ForeAgent is a fine-tuned Qwen3-VL-8B model developed by Shimin, specifically designed for AI-generated image detection. This 8 billion parameter vision-language model analyzes images through multi-view forensic analysis, incorporating semantic, frequency-domain, and spatial-domain features. It excels at distinguishing real from fake (AI-generated) images, achieving 82.18% accuracy on the Chameleon benchmark, and outputs structured JSON with conclusion, confidence, and reasoning.

Overview

ForeAgent (Forensics Agent) is a specialized 8 billion parameter vision-language model, fine-tuned from Qwen3-VL-8B by Shimin, for AI-generated image detection. It determines whether an image is authentic or AI-generated by performing a multi-view forensic analysis. The model can take both the original image and its frequency-domain representation (wavelet cD) as input; the frequency-domain image is optional, but supplying it gives the best performance.
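To make the "wavelet cD" input more concrete, here is a minimal sketch of one way such a frequency-domain view could be computed. It assumes that cD means the diagonal detail coefficients of a single-level 2D discrete wavelet transform (the naming used by libraries such as PyWavelets) and that a Haar wavelet is used; the model card states neither the wavelet family nor the decomposition level, so both are assumptions. The function name `haar_diagonal_detail` is hypothetical, and the sign convention may differ from a given library.

```python
def haar_diagonal_detail(gray):
    """Single-level 2D Haar diagonal detail (cD) of a grayscale image.

    `gray` is a list of equal-length rows of numbers (pixel intensities).
    A trailing odd row or column is dropped. Assumption: Haar wavelet,
    one decomposition level; the model card does not specify either.
    """
    if not gray:
        return []
    rows = len(gray) - len(gray) % 2
    cols = len(gray[0]) - len(gray[0]) % 2
    cd = []
    for r in range(0, rows, 2):
        out_row = []
        for c in range(0, cols, 2):
            # 2x2 block: top-left, top-right, bottom-left, bottom-right
            tl, tr = gray[r][c], gray[r][c + 1]
            bl, br = gray[r + 1][c], gray[r + 1][c + 1]
            # The diagonal detail responds to checkerboard-like variation
            # and is zero for flat, purely horizontal, or purely vertical
            # gradients inside the block.
            out_row.append((tl - tr - bl + br) / 2.0)
        cd.append(out_row)
    return cd


flat = [[5, 5, 5, 5], [5, 5, 5, 5]]
checker = [[1, 0], [0, 1]]
print(haar_diagonal_detail(flat))
print(haar_diagonal_detail(checker))
```

Smooth regions give coefficients near zero, while fine diagonal or checkerboard-like texture gives large magnitudes, so this view isolates high-frequency structure. In practice the coefficient map would still need to be rescaled and saved as an image before being passed to the model alongside the original.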

Key Capabilities

  • High Accuracy: Achieves 82.18% accuracy on the Chameleon benchmark, outperforming AIDE by 16.41 percentage points.
  • Multi-View Analysis: Integrates semantic features (texture, anatomy, consistency, artifacts), frequency-domain features (wavelet cD), and spatial-domain features (noise pattern residuals).
  • Structured Output: Provides a JSON output including a conclusion ("real" or "fake"), a confidence score (0.0-1.0), and a brief reasoning.
  • Iterative Self-Refinement: Trained using a Hindsight-Driven Self-Refining (EFA) pipeline involving iterative sampling, reflection, and evolution to improve reasoning quality and detection capabilities.
  • Dual-Input Mode: Supports optional dual-image input (original + wavelet frequency domain) for best performance.
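The structured output described above can be validated before it is used downstream. The sketch below assumes the JSON keys are literally `conclusion`, `confidence`, and `reasoning`, and that the model may wrap the JSON object in surrounding text; neither detail is confirmed by the model card. The names `Verdict` and `parse_verdict` are hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass
class Verdict:
    conclusion: str    # "real" or "fake"
    confidence: float  # expected in [0.0, 1.0]
    reasoning: str


def parse_verdict(raw: str) -> Verdict:
    """Extract and validate the JSON verdict from raw model text.

    Assumption: the keys are named "conclusion", "confidence", and
    "reasoning"; adjust them if the actual output differs.
    """
    # Take the outermost {...} span in case extra text surrounds the JSON.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    data = json.loads(raw[start:end + 1])

    conclusion = str(data["conclusion"]).strip().lower()
    if conclusion not in {"real", "fake"}:
        raise ValueError(f"unexpected conclusion: {conclusion!r}")

    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")

    return Verdict(conclusion, confidence, str(data.get("reasoning", "")))


sample = 'Result: {"conclusion": "fake", "confidence": 0.93, "reasoning": "Inconsistent skin texture."}'
print(parse_verdict(sample))
```

Rejecting malformed or out-of-range output explicitly, rather than defaulting to a label, keeps a moderation pipeline from treating a parsing failure as a detection result.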

Good For

  • AI-generated image detection and forensic analysis.
  • Deepfake detection in content moderation workflows.
  • Research into multimodal reasoning for image authenticity verification.
  • Integration into agentic forensic systems requiring detailed image analysis.