iCIIT/redqueenprotocol-sin-llama3.2-3B-model

Text Generation · Concurrency Cost: 1 · Model Size: 3.2B · Quant: BF16 · Ctx Length: 32k · Published: Aug 4, 2025 · License: MIT · Architecture: Transformer

The iCIIT/redqueenprotocol-sin-llama3.2-3B-model is a 3.2-billion-parameter instruction-tuned Llama 3.2 model developed by RedQueen Protocol. It specializes in generative Sinhala question answering (QA), a capability built through a novel two-stage fine-tuning process: the model first gained a comprehensive grasp of the Sinhala language via domain adaptation on Sinhala Wikipedia, then underwent sequential task adaptation on multiple Sinhala QA datasets using LoRA. It is optimized for accurate question answering in Sinhala.


RedQueen Llama 3.2 3B - Sinhala Generative QA

This 3.2 billion parameter model, developed by RedQueen Protocol (Ramiru De Silva and Senadhi Thimanya), is an instruction-tuned Llama 3.2 variant specifically designed for generative Question Answering in Sinhala. It was created for the iCIIT Conclave 2025 Shared Task on Building Compact Sinhala & Tamil LLMs.

Key Capabilities & Training

The model's proficiency stems from a novel two-stage fine-tuning process utilizing Low-Rank Adaptation (LoRA):

  • Stage 1: Domain Adaptation (Language Foundation): The base Llama-3.2-3B-IT model was fine-tuned on the entirety of the Sinhala Wikipedia. This stage established a strong linguistic foundation and comprehensive understanding of the Sinhala language.
  • Stage 2: Task Adaptation (Sequential QA Fine-tuning): Building on the Wikipedia-tuned model, a single LoRA adapter was sequentially fine-tuned across three distinct Sinhala QA datasets:
    • A custom dataset of 528 Sinhala QA pairs.
    • 10,000 samples from the ihalage/sinhala-finetune-qa-eli5 dataset.
    • 13,500 samples from the janani-rane/SiQuAD dataset, formatted for context-question-answer tasks.

This hierarchical training strategy ensures the model first masters the language and then specializes in generative QA, making it highly effective for Sinhala-specific question-answering tasks.

How to Use

The model can be loaded and used with its corresponding LoRA adapter for text generation tasks, as demonstrated in the provided Python code snippet, which includes instructions for both Kaggle and Colab environments.