g-assismoraes/Qwen3-4B-it-pira-ep3-QA-qairm
Text Generation · Concurrency Cost: 1 · Model Size: 4B · Quant: BF16 · Ctx Length: 32k · Published: Apr 8, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

The g-assismoraes/Qwen3-4B-it-pira-ep3-QA-qairm model is a fine-tuned 4-billion-parameter Qwen3-Instruct variant developed by g-assismoraes. It is based on Qwen/Qwen3-4B-Instruct-2507 and was trained with a 32,768-token context length. The model is specifically adapted for question-answering tasks, leveraging its instruction-tuned base for conversational interaction.


Model Overview

This model, g-assismoraes/Qwen3-4B-it-pira-ep3-QA-qairm, is a fine-tuned version of Qwen3-4B-Instruct-2507, a base model developed by Qwen. It has 4 billion parameters and supports a substantial context length of 32,768 tokens, making it suitable for processing longer inputs and maintaining conversational coherence over extended interactions.
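Since the model card does not include usage code, here is a minimal inference sketch using the standard transformers API, assuming the model inherits the Qwen3 chat template from its base; the prompt and generation settings are illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "g-assismoraes/Qwen3-4B-it-pira-ep3-QA-qairm"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 weights listed above
    device_map="auto",
)

# Example prompt; any QA-style user message works here.
messages = [{"role": "user", "content": "What is the capital of Brazil?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```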

Key Training Details

The model was fine-tuned with the following hyperparameters (see the configuration sketch after this list):

  • Learning Rate: 0.0002
  • Optimizer: AdamW (adamw_torch) with default betas and epsilon
  • Batch Size: A total effective batch size of 32 (train_batch_size 4, gradient_accumulation_steps 8)
  • Epochs: 3
  • Scheduler: Cosine learning rate scheduler with a 0.05 warmup ratio
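Expressed as Hugging Face TrainingArguments, the reported hyperparameters map as shown below. This is a hypothetical reconstruction for reference only; the dataset, output path, and any other settings are not documented on this card and are omitted or placeholders:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen3-4b-it-pira-ep3-qa-qairm",  # illustrative path
    learning_rate=2e-4,             # reported as 0.0002
    optim="adamw_torch",            # default betas and epsilon
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,  # effective batch size of 32
    num_train_epochs=3,
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
)
```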

Intended Use Cases

Given its instruction-tuned foundation and QA-focused fine-tuning, this model is primarily intended for question-answering (QA) applications. Its 32k context window allows it to retrieve and synthesize information from long documents or conversation histories when formulating answers.
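For long-document QA specifically, a common pattern is to place the document and the question in a single user turn. The sketch below reuses the tokenizer and model from the earlier snippet; the file path, document, and question are placeholders:

```python
# Hypothetical source document; replace with your own text.
with open("report.txt") as f:
    document = f.read()

messages = [{
    "role": "user",
    "content": (
        "Answer the question using only the document below.\n\n"
        f"Document:\n{document}\n\n"
        "Question: What were the key findings?"
    ),
}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# The 32,768-token context window bounds document + prompt + answer combined.
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```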