Overview
This is an experimental 2.6 billion parameter instruction-tuned model, gemma-2b-it-tamil-v0.1-alpha, developed by Abhinand Balachandran. It is based on Google's Gemma 2B and is specifically adapted for bilingual English and Tamil language processing. The model was continually pretrained on Tamil Wikipedia data for 3 epochs, then fine-tuned on a mix of English and Tamil Alpaca datasets for 5 epochs.
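For orientation, here is a minimal loading sketch using the Hugging Face transformers library. The repo id is an assumption inferred from the model name and author, and is not stated in this card; verify it against the actual Hub listing.

```python
# Minimal loading sketch with Hugging Face transformers.
# The repo id below is assumed from the model name and author.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "abhinand/gemma-2b-it-tamil-v0.1-alpha"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # a compact 2B model fits comfortably on one GPU in bf16
    device_map="auto",
)
```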
Key Capabilities & Performance
- Bilingual Proficiency: Designed for both English and Tamil language understanding and generation.
- Efficient Adaptation: Achieves Tamil language adaptation without expanding the base model's vocabulary.
- Benchmark Outperformance: Surpasses Google's Gemma 2B base and instruct models, as well as mlabonne/Gemmalpaca-2B, across benchmarks in the Nous evaluation suite, including AGIEval, GPT4All, TruthfulQA, and Bigbench (see the reproduction sketch after this list).
- Instruction Following: Fine-tuned on 100,000 samples for robust instruction-following capabilities.
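For readers who want to sanity-check these results, the following is a hedged sketch using EleutherAI's lm-evaluation-harness rather than the exact pipeline behind the reported Nous-suite scores. The task ids shown are assumptions and vary across harness versions.

```python
# Hedged evaluation sketch using lm-evaluation-harness (pip install lm-eval).
# This is NOT the exact pipeline used for the reported scores; task ids are
# assumptions and differ between harness versions.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=abhinand/gemma-2b-it-tamil-v0.1-alpha,dtype=bfloat16",
    tasks=["agieval", "truthfulqa_mc2"],  # add GPT4All/Bigbench task groups as available
)
print(results["results"])  # per-task metric dictionary
```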
Limitations
- Experimental Release: This is an alpha release and is still under development, with potential for further performance improvements through more extensive pretraining.
- No Detoxification: The model has not undergone detoxification and may generate harmful or offensive content; user discretion is required.
Use Cases
This model suits applications that need instruction-tuned language processing in both English and Tamil, particularly where a compact 2B-parameter model is preferred for efficiency.
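A minimal inference sketch follows, continuing the loading example in the Overview. It assumes the tokenizer ships Gemma's standard chat template, which this card does not confirm; greedy decoding is used for determinism, and sampling parameters can be tuned for more varied output.

```python
# Inference sketch continuing the loading example above. Assumes the tokenizer
# ships Gemma's standard chat template (not confirmed by this card).
messages = [{"role": "user", "content": "தமிழ்நாட்டின் தலைநகரம் எது?"}]  # "What is the capital of Tamil Nadu?"
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```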