Indic-gemma-7b-finetuned-sft-Navarasa-2.0 Overview
This model, developed by Ravi Theja and Ramsri Goutham of Telugu-LLM-Labs, is an 8.5-billion-parameter instruction-tuned variant of Google's Gemma-7b. It was fine-tuned with LoRA on approximately 650,000 instruction samples spanning 15 Indian languages and English, improving its ability to understand and generate text across those languages.
Key Capabilities
- Multilingual Instruction Following: Specialized in responding to instructions in 15 Indian languages (Hindi, Telugu, Marathi, Urdu, Assamese, Konkani, Nepali, Sindhi, Tamil, Kannada, Malayalam, Gujarati, Punjabi, Bengali, Odia) and English.
- Gemma Architecture: Built on Google's Gemma-7b base model.
- Efficient Fine-tuning: Uses the unsloth library for efficient training and faster inference (see the loading sketch after this list).
- Context Length: Supports a context window of 8192 tokens.
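As a minimal sketch of unsloth-based loading, here is one way to pull the model in for fast inference. The Hugging Face repository id, dtype, and 4-bit setting below are illustrative assumptions, not confirmed by this card:

```python
# Minimal sketch: loading the model with unsloth for fast inference.
# The repo id and quantization choices here are assumptions.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Telugu-LLM-Labs/Indic-gemma-7b-finetuned-sft-Navarasa-2.0",  # assumed repo id
    max_seq_length=8192,   # matches the context window noted above
    dtype=None,            # let unsloth pick bfloat16/float16 for the GPU
    load_in_4bit=False,    # set True to trade precision for memory
)
FastLanguageModel.for_inference(model)  # enable unsloth's faster inference path
```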
Training Details
The model was trained for 45 hours on a single A100 GPU with 80GB of memory, on datasets such as samvaad-hi-filtered, telugu_alpaca_yahma_cleaned_filtered_romanized, and various other Alpaca-style datasets for Indic languages. Inference can be performed either with the unsloth library for optimized speed or with standard Hugging Face Transformers.
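As an illustrative sketch of the standard Transformers path: the repository id and the Alpaca-style prompt template below are assumptions (the template is inferred from the Alpaca-style training datasets named above), so treat them as placeholders rather than the official usage:

```python
# Hedged sketch: inference via Hugging Face Transformers.
# The repo id and Alpaca-style prompt format are assumptions,
# not specifics confirmed by this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Telugu-LLM-Labs/Indic-gemma-7b-finetuned-sft-Navarasa-2.0"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Alpaca-style instruction prompt (assumed format).
prompt = (
    "### Instruction:\nTranslate the following sentence to Hindi.\n\n"
    "### Input:\nHow are you today?\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```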