Overview
bharadwajswarna/Zephyr-Gemma-7B-Telugu is a Supervised Fine-Tuned (SFT) model built on HuggingFaceH4/zephyr-7b-gemma-v0.1. Developed by Bharadwaj Swarna, it was trained on Telugu Question & Answer datasets curated by Telugu LLM Labs, specializing it in generating responses in the Telugu language.
Key Capabilities
- Telugu Language Generation: Optimized for understanding and generating text in Telugu, particularly for Q&A formats.
- SFT Training: Utilizes Supervised Fine-Tuning on a domain-specific dataset to enhance performance for Telugu tasks.
- Gemma-based Architecture: Built upon the Zephyr-Gemma foundation model, inheriting its underlying language capabilities.
Limitations and Future Work
- No DPO Alignment: Currently, the model is not aligned via DPO (Direct Preference Optimization) in Telugu. This is a work in progress, with dataset curation underway for future DPO training.
Good for
- Telugu Q&A Systems: Ideal for applications requiring accurate and contextually relevant answers to questions posed in Telugu.
- Telugu Content Generation: Useful for generating various forms of text content in Telugu, given its fine-tuning on Q&A data.
- Research and Development: Provides a strong baseline for further research and development in Telugu natural language processing, especially for DPO alignment experiments.
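As a usage illustration, the sketch below shows how a single-turn Telugu question might be formatted into a chat prompt before being passed to the model (e.g. via the transformers library). The model id comes from this card, but the Zephyr-style chat layout in the helper is an assumption; in real use, load the model's tokenizer and call `tokenizer.apply_chat_template()` to get the authoritative format.

```python
# Illustrative sketch only: the exact chat template for
# bharadwajswarna/Zephyr-Gemma-7B-Telugu is an assumption here.
# In practice, prefer:
#   tokenizer = AutoTokenizer.from_pretrained("bharadwajswarna/Zephyr-Gemma-7B-Telugu")
#   prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

def build_prompt(question: str,
                 system: str = "You are a helpful Telugu assistant.") -> str:
    """Format a single-turn Q&A prompt in a Zephyr-style chat layout."""
    return (
        f"<|system|>\n{system}</s>\n"
        f"<|user|>\n{question}</s>\n"
        f"<|assistant|>\n"
    )

# Example Telugu question: "What is the capital of Telangana?"
prompt = build_prompt("తెలంగాణ రాజధాని ఏమిటి?")
print(prompt)
```

The trailing `<|assistant|>` turn marker leaves the prompt open for the model to complete, which is the usual pattern for chat-tuned causal LMs.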