Overview
This is an experimental 2.6 billion parameter instruction-tuned model, gemma-2b-it-tamil-v0.1-alpha, developed by Abhinand Balachandran. It is based on Google's Gemma 2B and is specifically adapted for bilingual English and Tamil language processing. The model was continually pretrained on Tamil Wikipedia data for 3 epochs, then fine-tuned on a mix of English and Tamil Alpaca datasets for 5 epochs.
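For orientation, here is a minimal loading sketch using the Hugging Face transformers library. The repo id is an assumption inferred from the model name and author, and is not stated in this card; verify it against the actual Hub listing.

```python
# Minimal loading sketch with Hugging Face transformers.
# The repo id below is assumed from the model name and author.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "abhinand/gemma-2b-it-tamil-v0.1-alpha"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # a compact 2B model fits comfortably on one GPU in bf16
    device_map="auto",
)
```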
Key Capabilities & Performance
- Bilingual Proficiency: Designed for both English and Tamil language understanding and generation.
- Efficient Adaptation: Achieves Tamil language adaptation without expanding the base model's vocabulary.
- Benchmark Outperformance: Surpasses Google's Gemma 2B base and instruct models, as well as mlabonne/Gemmalpaca-2B, across benchmarks in the Nous evaluation suite, including AGIEval, GPT4All, TruthfulQA, and Bigbench (see the reproduction sketch after this list).
- Instruction Following: Fine-tuned on 100,000 samples for robust instruction-following capabilities.
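For readers who want to sanity-check these results, the following is a hedged sketch using EleutherAI's lm-evaluation-harness rather than the exact pipeline behind the reported Nous-suite scores. The task ids shown are assumptions and vary across harness versions.

```python
# Hedged evaluation sketch using lm-evaluation-harness (pip install lm-eval).
# This is NOT the exact pipeline used for the reported scores; task ids are
# assumptions and differ between harness versions.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=abhinand/gemma-2b-it-tamil-v0.1-alpha,dtype=bfloat16",
    tasks=["agieval", "truthfulqa_mc2"],  # add GPT4All/Bigbench task groups as available
)
print(results["results"])  # per-task metric dictionary
```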
Limitations
- Experimental Release: This is an alpha release and is still under development, with potential for further performance improvements through more extensive pretraining.
- No Detoxification: The model has not undergone detoxification and may generate harmful or offensive content; user discretion is required.
Use Cases
This model suits applications that need instruction-tuned language processing in both English and Tamil, particularly where a compact 2B-parameter model is preferred for efficiency.
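A minimal inference sketch follows, continuing the loading example in the Overview. It assumes the tokenizer ships Gemma's standard chat template, which this card does not confirm; greedy decoding is used for determinism, and sampling parameters can be tuned for more varied output.

```python
# Inference sketch continuing the loading example above. Assumes the tokenizer
# ships Gemma's standard chat template (not confirmed by this card).
messages = [{"role": "user", "content": "தமிழ்நாட்டின் தலைநகரம் எது?"}]  # "What is the capital of Tamil Nadu?"
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```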