AJNG/qwen_3_nepali_ocr_merged_phase1

Modality: Vision · Concurrency Cost: 1 · Model Size: 4B · Quant: BF16 · Ctx Length: 32k · Published: May 6, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

AJNG/qwen_3_nepali_ocr_merged_phase1 is a 4-billion-parameter Qwen3-VL model developed by AJNG and, as its name indicates, fine-tuned for Nepali OCR (optical character recognition), published as a merged phase-1 checkpoint. It builds on the Qwen3-VL architecture and was trained with Unsloth and Hugging Face's TRL library for accelerated training. It is designed for applications requiring a compact yet capable vision-language model, and its 32768-token context length makes it suitable for processing moderately long inputs.


AJNG/qwen_3_nepali_ocr_merged_phase1 Overview

This model, developed by AJNG, is a fine-tuned variant of Qwen3-VL-4B-Instruct, with 4 billion parameters and a 32768-token context length. It was trained using the Unsloth framework together with Hugging Face's TRL library, which together enabled roughly 2x faster training.

Key Characteristics

  • Base Model: Fine-tuned from unsloth/Qwen3-VL-4B-Instruct.
  • Training Efficiency: Trained with Unsloth and TRL for roughly 2x faster fine-tuning (see the sketch after this list).
  • Parameter Count: A compact 4 billion parameters, balancing performance with resource efficiency.
  • Context Length: A 32768-token context window, allowing long or detailed inputs to be processed in one pass.
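
For orientation, here is a minimal sketch of how a Qwen3-VL checkpoint is typically fine-tuned with Unsloth and TRL. The dataset contents, LoRA ranks, and training arguments below are illustrative assumptions, not AJNG's published phase-1 recipe.

```python
# Minimal Unsloth + TRL fine-tuning sketch. All hyperparameters and
# data below are placeholder assumptions, not the author's settings.
from PIL import Image
from unsloth import FastVisionModel
from trl import SFTConfig, SFTTrainer

# Load the base checkpoint this model was fine-tuned from.
model, tokenizer = FastVisionModel.from_pretrained(
    "unsloth/Qwen3-VL-4B-Instruct",
    load_in_4bit=False,  # assumption: BF16 training, matching the published quant
)

# Attach LoRA adapters to both the vision and language towers.
model = FastVisionModel.get_peft_model(
    model,
    finetune_vision_layers=True,
    finetune_language_layers=True,
    r=16,           # placeholder rank
    lora_alpha=16,  # placeholder alpha
)

# A real run would use image + Nepali transcription pairs in chat
# format; one dummy sample is shown. Unsloth's vision notebooks also
# supply a vision-aware data collator, omitted here for brevity.
train_dataset = [
    {
        "messages": [
            {"role": "user", "content": [
                {"type": "image", "image": Image.new("RGB", (64, 64))},
                {"type": "text", "text": "Transcribe the Nepali text in this image."},
            ]},
            {"role": "assistant", "content": [
                {"type": "text", "text": "नमस्ते"},
            ]},
        ]
    }
]

trainer = SFTTrainer(
    model=model,
    processing_class=tokenizer,
    train_dataset=train_dataset,
    args=SFTConfig(
        per_device_train_batch_size=2,  # placeholder
        num_train_epochs=1,             # placeholder
        learning_rate=2e-4,             # placeholder
        output_dir="qwen3_nepali_ocr_phase1",
        remove_unused_columns=False,    # keep the image column intact
    ),
)
trainer.train()

# Merging the LoRA weights back into the base model would yield a
# "merged" checkpoint like the one published here.
```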

Potential Use Cases

Given its Qwen3-VL base and the Nepali-OCR fine-tune indicated by its name, this model is most plausibly suited to extracting Nepali text from images and scanned documents. Developers looking for a compact Qwen3-VL-based model with an efficient training recipe and a reasonable context window for such specialized tasks may find it suitable; a hedged inference sketch follows.
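
As a starting point, the snippet below shows one common way to load a Qwen3-VL-family checkpoint with Hugging Face Transformers and ask it to transcribe an image. The file name, prompt wording, and generation settings are assumptions for illustration; check the repository's own files for any prompt format the fine-tune expects.

```python
# Illustrative inference sketch for a Qwen3-VL-family checkpoint.
# Prompt text and generation settings are assumptions, not the
# author's documented usage.
import torch
from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "AJNG/qwen_3_nepali_ocr_merged_phase1"
model = AutoModelForImageTextToText.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# A chat-style message with one image and an OCR instruction.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "document_page.png"},  # hypothetical local file
            {"type": "text", "text": "Transcribe the Nepali text in this image."},
        ],
    }
]

# Recent Transformers versions let the processor tokenize the full
# multimodal chat template in one call.
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=512)

# Strip the prompt tokens and decode only the newly generated text.
new_tokens = output_ids[:, inputs["input_ids"].shape[1]:]
print(processor.batch_decode(new_tokens, skip_special_tokens=True)[0])
```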