Persian Gemma 2b: An Experimental Conversational AI
mshojaei77/Gemma-2-2b-fa is an early-stage experimental model derived from Google's Gemma-2-2b-it, fine-tuned using QLoRA for Persian language conversational tasks. With 2.6 billion parameters and an 8192-token context length, it inherits the efficient architecture of the base Gemma model.
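A minimal loading-and-generation sketch, assuming the standard Hugging Face transformers text-generation pipeline and Gemma's chat template; the prompt is just an illustrative example:

```python
# Minimal inference sketch (assumes a transformers version with Gemma 2 support).
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="mshojaei77/Gemma-2-2b-fa",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Gemma expects chat-formatted input; the pipeline applies the chat
# template automatically when given a list of messages.
messages = [{"role": "user", "content": "سلام! حالت چطوره؟"}]  # "Hello! How are you?"
output = generator(messages, max_new_tokens=128)
print(output[0]["generated_text"][-1]["content"])  # the assistant's reply
```

Given the training caveats below, expect highly variable output quality from this call.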
Key Characteristics & Training:
- Base Model: google/gemma-2-2b-it
- Fine-tuning: QLoRA (Quantized Low-Rank Adaptation) for parameter-efficient training (see the sketch after this list).
- Dataset: mshojaei77/Persian_sft, a collection of Persian conversations for instruction fine-tuning.
- Language: Primarily Persian (fa).
- Critical Note: The model was trained for only 20 steps, making it a proof-of-concept with significantly under-optimized performance and no formal evaluation.
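For the educational angle, here is an illustrative sketch of a QLoRA setup over the same base model and dataset, assuming the Hugging Face peft, bitsandbytes, and datasets libraries; the LoRA hyperparameters (r, alpha, target modules) are hypothetical placeholders, not the values used to train this model:

```python
# Illustrative QLoRA configuration; hyperparameters are hypothetical.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit quantized base weights: the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it",
    quantization_config=bnb_config,
    device_map="auto",
)

# Small trainable low-rank adapters on the attention projections: the "LoRA" part.
lora_config = LoraConfig(
    r=16,                 # placeholder rank
    lora_alpha=32,        # placeholder scaling factor
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable

# The Persian conversation dataset used for instruction fine-tuning.
data = load_dataset("mshojaei77/Persian_sft")
```

Because the quantized base weights stay frozen and only the adapters are updated, this setup fits fine-tuning of a 2.6B-parameter model into modest GPU memory.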
Intended Use Cases:
- Research & Experimentation: Investigating the feasibility of fine-tuning Gemma for Persian conversational AI.
- Educational Purposes: Demonstrating QLoRA fine-tuning techniques and Persian language model development.
- Prototyping (with caution): Exploring potential applications, acknowledging its preliminary state and limitations.
Limitations:
Due to severe under-training, the model exhibits:
- Sub-optimal Performance: Limited fluency and coherence, with a pronounced tendency to hallucinate.
- Bias: Likely inherits and amplifies biases from its base model and dataset.
- Poor Generalization: Performance degrades significantly outside the training distribution.
- No Formal Evaluation: Performance metrics are unavailable, and output quality is highly variable.