abhinand/gemma-2b-tamil

Hugging Face

  • Task: Text generation
  • Model size: 2.6B parameters
  • Quantization: BF16
  • Context length: 8k
  • Published: Feb 25, 2024
  • License: gemma-terms-of-use
  • Architecture: Transformer

abhinand/gemma-2b-tamil is a 2.6-billion-parameter foundational language model, continually pretrained from Google's Gemma 2B and specifically adapted for the Tamil language. Developed by Abhinand Balachandran, this experimental model targets causal language modeling in both English and Tamil without expanding the original Gemma vocabulary. For a 2B-parameter model, it demonstrates promising bilingual capabilities, particularly for Tamil language generation and understanding.


Model Overview

abhinand/gemma-2b-tamil is an experimental 2.6 billion parameter foundational language model, continually pretrained from Google's Gemma 2B. Developed by Abhinand Balachandran, this model aims to adapt Gemma for the Tamil language without expanding its original vocabulary, making it a bilingual model supporting both English and Tamil.

Key Capabilities & Training

  • Bilingual Support: Designed for causal language modeling in both English and Tamil.
  • Continual Pretraining: The Gemma base model was continually pretrained on all available Tamil Wikipedia data for 3 epochs.
  • Finetuning: Subsequently finetuned on a mix of English and Tamil Alpaca datasets for 5 epochs (the instruction-tuned version is available separately).
  • Experimental Nature: This is an alpha release, with potential for improved performance through extended pretraining on larger datasets like CulturaX.
  • Training Details: Trained in bfloat16 precision on 4x Nvidia RTX 3090 GPUs.
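The capabilities above can be exercised with a short generation sketch. The following is a minimal, hypothetical example using the Hugging Face `transformers` library; the `generate` helper, the Tamil prompt, and the token budget are illustrative choices, not part of the model card. Loading in bfloat16 matches the precision the model was trained in.

```python
# Minimal sketch (assumptions noted in the lead-in): load the model with
# Hugging Face `transformers` and continue a Tamil prompt.

MODEL_ID = "abhinand/gemma-2b-tamil"

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    # Imports are local so the sketch can be read without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # bfloat16 matches the precision used during continual pretraining.
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

if __name__ == "__main__":
    # Tamil prompt ("The Tamil language"); the base model continues the text.
    print(generate("தமிழ் மொழி"))
```

Since this is a base (non-instruction-tuned) model, it is best suited to text continuation rather than chat-style prompting; the separately released instruction-tuned version is the better fit for instruction following.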

Performance & Limitations

While experimental, the model shows promise for its 2B-parameter size. Open LLM Leaderboard results report an average score of 45.13, including HellaSwag (10-shot) at 71.30 and MMLU (5-shot) at 38.21. Note that the model has not undergone detoxification and may generate harmful or offensive content, so its outputs should be used with discretion and under supervision.
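Scores in this style are commonly produced with EleutherAI's lm-evaluation-harness. A hedged command-line sketch for the HellaSwag (10-shot) setting might look like the following; the exact flags, harness version, and batch size are assumptions, not documented by the model card:

```shell
# Assumed reproduction sketch using EleutherAI's lm-evaluation-harness
# ("hf" backend); flag names may differ across harness versions.
pip install lm-eval

lm_eval --model hf \
  --model_args pretrained=abhinand/gemma-2b-tamil,dtype=bfloat16 \
  --tasks hellaswag \
  --num_fewshot 10 \
  --batch_size 8
```

MMLU would be run analogously with `--tasks mmlu --num_fewshot 5`; leaderboard numbers may still differ depending on the harness revision and prompt formatting used.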