Desta 1B Question-Answering v4552 Rosa: Tigrinya QA Model
This model is a 1.1 billion parameter causal language model with a LLaMA-style decoder-only architecture, developed by mewaeltsegay. It is a fully fine-tuned variant of mewaeltsegay/desta_1b, optimized for Tigrinya question answering on the TiQuAD dataset.
Key Capabilities
- Tigrinya Question Answering: Excels at generating answers to QA-style prompts in Tigrinya.
- Full-Parameter Fine-tuning: All 1.1 billion parameters were updated during fine-tuning, rather than training LoRA adapters on a frozen base model.
- Performance: Reaches a validation exact match (EM) score of 42.3690 and an F1 score of 50.2434 on the TiQuAD dataset.
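The EM and F1 figures above are SQuAD-style answer-level metrics. A minimal sketch of how they are typically computed is below; the function names are illustrative, and the exact text normalization used for TiQuAD scoring may differ (SQuAD's official script also strips punctuation and articles):

```python
from collections import Counter

def normalize(text: str) -> str:
    # Lowercase and collapse whitespace. The official SQuAD metric also
    # removes punctuation and English articles; omitted here for brevity.
    return " ".join(text.lower().split())

def exact_match(pred: str, gold: str) -> float:
    # 1.0 if the normalized prediction equals the normalized gold answer.
    return float(normalize(pred) == normalize(gold))

def token_f1(pred: str, gold: str) -> float:
    # Token-overlap F1 between prediction and gold answer.
    pred_toks = normalize(pred).split()
    gold_toks = normalize(gold).split()
    common = Counter(pred_toks) & Counter(gold_toks)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_toks)
    recall = overlap / len(gold_toks)
    return 2 * precision * recall / (precision + recall)
```

Corpus-level EM and F1 are then the averages of these per-example scores over the validation set, scaled to percentages.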
Intended Use Cases
- Tigrinya QA Research: Ideal for academic and research purposes in Tigrinya question answering.
- Low-Resource NLP Experimentation: Suitable for educational and experimental work in low-resource natural language processing.
- Baseline Model: Can serve as a foundational model for further domain adaptation and development.
Limitations and Recommendations
Users should be aware of potential hallucinations, inherited biases, and degraded performance on out-of-domain inputs. For factual applications, retrieval or source grounding is recommended, along with human-in-the-loop verification for sensitive use cases.
Training Details
The model was fully fine-tuned for 10 epochs with a learning rate of 5e-5 and a maximum sequence length of 1024 tokens.
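For inference, the model can be loaded like any Hugging Face causal LM. The snippet below is a hedged sketch: the repository id, prompt template, and generation settings are illustrative assumptions, not values documented in this card, so adjust them to match the actual checkpoint:

```python
def build_prompt(context: str, question: str) -> str:
    # Illustrative QA prompt template; the template used during
    # fine-tuning is not documented in this card.
    return f"Context: {context}\nQuestion: {question}\nAnswer:"

def answer(context: str, question: str, max_new_tokens: int = 64) -> str:
    # Lazy import so the prompt helper is usable without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Hypothetical repository id; substitute the actual fine-tuned model id.
    model_id = "mewaeltsegay/desta_1b"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Inputs longer than the 1024-token training context should be truncated.
    inputs = tokenizer(build_prompt(context, question), return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode only the newly generated tokens, skipping the prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Since this is a decoder-only model rather than an extractive QA head, answers are generated token by token, so the retrieval and verification recommendations above apply to anything it produces.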