Overview
Gemma-2B-Tele is a 2.6-billion-parameter Transformer model developed by Ali Maatouk, built on Google's Gemma-2B architecture. It has been continually pretrained on a specialized dataset called Tele-Data, comprising approximately 2.5 billion tokens of telecommunications-related material, including industry articles, standards, and general web content. The model supports a context length of 8192 tokens.
Key Capabilities
- Telecommunications Specialization: Optimized for understanding and generating content within the telecommunications domain.
- Enhanced Domain Performance: Outperforms the original Gemma-2B model on telecommunications-specific benchmarks like Tele-Eval.
- Retained General Performance: Maintains performance levels comparable to the base Gemma-2B across common sense, language understanding, and logical reasoning benchmarks.
- Base Model: Designed as a base model for further fine-tuning on telecommunications applications; it operates in a plain text-completion setting rather than following instructions.
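Since the model works as a text completer rather than an instruction follower, a minimal usage sketch with the Hugging Face `transformers` library might look like the following. The repository id `AliMaatouk/Gemma-2B-Tele` and the example prompt are assumptions for illustration, not confirmed by this card; verify the id on the Hugging Face Hub before use.

```python
# Minimal text-completion sketch for a Gemma-2B-Tele-style base model.
# Assumptions: the model is hosted under the repo id below and loads with
# the standard causal-LM classes from `transformers`.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "AliMaatouk/Gemma-2B-Tele"  # assumed repo id


def complete(prompt: str, max_new_tokens: int = 64) -> str:
    """Return a plain text completion (no chat template; this is a base model)."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Telecom-flavored prompt: the model continues the text rather than
    # answering it as an instruction.
    print(complete("The 3GPP standard for 5G New Radio defines"))
```

Because the model is not instruction-tuned, prompts should be phrased as text to be continued; for question-answer behavior, the instruction-tuned variant mentioned below is the better fit.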
Good For
- Fine-tuning: Ideal for developers looking to fine-tune a model for specific telecommunications tasks.
- Domain-Specific Text Completion: Generating relevant and accurate text completions for telecommunications-related prompts.
- Research: Exploring specialized language models for niche technical domains.
An instruction-tuned version of this model, Gemma-2B-Tele-it, is also available for instruction-following tasks.