bharadwajswarna/Zephyr-Gemma-7B-Telugu

8.5B parameters · FP8 · 8192 context length · License: MIT
Overview

bharadwajswarna/Zephyr-Gemma-7B-Telugu is a Supervised Fine-Tuned (SFT) model built on HuggingFaceH4/zephyr-7b-gemma-v0.1. Developed by Bharadwaj Swarna, it was trained on Telugu question-and-answer datasets curated by Telugu LLM Labs, specializing it in generating responses in Telugu.

Key Capabilities

  • Telugu Language Generation: Optimized for understanding and generating text in Telugu, particularly for Q&A formats.
  • SFT Training: Utilizes Supervised Fine-Tuning on a domain-specific dataset to enhance performance for Telugu tasks.
  • Gemma-based Architecture: Built upon the Zephyr-Gemma foundation, inheriting its underlying language model capabilities.

Limitations and Future Work

  • No DPO Alignment: The model has not yet been aligned with DPO (Direct Preference Optimization) on Telugu data. This is a work in progress; dataset curation for future DPO training is underway.

Good for

  • Telugu Q&A Systems: Ideal for applications requiring accurate and contextually relevant answers to questions posed in Telugu.
  • Telugu Content Generation: Useful for generating various forms of text content in Telugu, given its fine-tuning on Q&A data.
  • Research and Development: Provides a strong baseline for further research and development in Telugu natural language processing, especially for DPO alignment experiments.
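As a minimal sketch of the Q&A use case above, the model can be queried through the standard `transformers` text-generation API. This assumes the repository ships a chat template inherited from zephyr-7b-gemma; the generation settings and the example question are illustrative, not prescriptive.

```python
# Sketch: Telugu Q&A with bharadwajswarna/Zephyr-Gemma-7B-Telugu via transformers.
# Assumption: the tokenizer provides a chat template (inherited from the
# zephyr-7b-gemma base), so we can use tokenizer.apply_chat_template.

def build_messages(question: str) -> list[dict]:
    """Wrap a Telugu question in the chat-message format expected by
    tokenizer.apply_chat_template (a list of role/content dicts)."""
    return [{"role": "user", "content": question}]


def main() -> None:
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "bharadwajswarna/Zephyr-Gemma-7B-Telugu"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    # Example question: "What is the capital of India?"
    messages = build_messages("భారతదేశ రాజధాని ఏమిటి?")
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    # Greedy decoding keeps the sketch deterministic; tune as needed.
    output = model.generate(inputs, max_new_tokens=128, do_sample=False)
    # Decode only the newly generated tokens, skipping the prompt.
    print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))


if __name__ == "__main__":
    main()
```

Sampling parameters (temperature, top-p) can be passed to `generate` for more varied answers; greedy decoding is shown only to keep the sketch reproducible.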