g-assismoraes/Gemma3-4B-it-pira-ep3-QA-qairm-ptbr
Gemma3-4B-it-pira-ep3-QA-qairm-ptbr is a 4.3 billion parameter language model fine-tuned from Google's Gemma-3-4b-it. This model is specifically adapted for question-answering tasks, leveraging its base architecture for conversational interactions. It is optimized for specific QA applications, making it suitable for focused information retrieval and response generation.
Loading preview...
Model Overview
The g-assismoraes/Gemma3-4B-it-pira-ep3-QA-qairm-ptbr is a fine-tuned variant of the Google Gemma-3-4b-it model, featuring approximately 4.3 billion parameters. This iteration is specifically adapted for Question Answering (QA) tasks, building upon the robust capabilities of its base model.
Key Characteristics
- Base Model: Derived from
google/gemma-3-4b-it. - Parameter Count: 4.3 billion parameters.
- Context Length: Supports a context window of 32,768 tokens.
- Fine-tuning Focus: Optimized for question-answering, suggesting enhanced performance in extracting and generating relevant responses to queries.
Training Details
The model underwent training with the following hyperparameters:
- Learning Rate: 0.0002
- Batch Size: A
train_batch_sizeof 4 andeval_batch_sizeof 8, with agradient_accumulation_stepsof 8, resulting in atotal_train_batch_sizeof 32. - Optimizer: ADAMW_TORCH with default betas and epsilon.
- Scheduler: Cosine learning rate scheduler with a 0.05 warmup ratio.
- Epochs: Trained for 3 epochs.
Intended Use Cases
This model is primarily intended for applications requiring question-answering capabilities, particularly in contexts where the base Gemma-3-4b-it model's strengths can be leveraged for precise and relevant responses. Its fine-tuning suggests suitability for tasks like chatbots, information retrieval systems, and automated customer support where direct answers to user questions are paramount.