Model Overview
Gemma-2-2B-Tele is a 2.6-billion-parameter Transformer model developed by Ali Maatouk as an adaptation of Google's Gemma-2-2B. It is specialized for the telecommunications domain through continual pretraining on Tele-Data, a large-scale dataset of approximately 2.5 billion tokens of telecommunications-specific content.
Key Capabilities and Performance
- Telecommunications Specialization: The model significantly outperforms the original Gemma-2-2B on telecommunications benchmarks such as Tele-Eval.
- General Performance Retention: Despite its domain specialization, Gemma-2-2B-Tele maintains performance comparable to the original Gemma-2-2B across benchmarks for common sense, language understanding, and logical reasoning.
- Context Length: It supports a context length of 8192 tokens.
- Base Model: This is a base model: it is not instruction-tuned, so it performs free-form text completion rather than following chat-style instructions. An instruction-tuned variant, Gemma-2-2B-Tele-it, is also available.
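Since this is a base model, a typical way to exercise it is plain text completion through Hugging Face transformers. The sketch below is illustrative, not authoritative: the Hub repository id is an assumption based on the model name, and the heavy model download is gated behind an environment variable so the snippet can be read and its small helper tested without GPU dependencies. The 8192-token truncation helper reflects the context length stated above.

```python
import os

MODEL_ID = "AliMaatouk/Gemma-2-2B-Tele"  # assumed Hub id; verify before use
PROMPT = "Shannon capacity is"
MAX_CONTEXT = 8192  # context length stated in the model card


def truncate_to_context(token_ids, max_len=MAX_CONTEXT):
    """Keep only the most recent tokens that fit the model's context window."""
    return token_ids[-max_len:]


def complete(prompt=PROMPT, max_new_tokens=64):
    # Imported lazily so the sketch can be read without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Base-model usage: no chat template, just continue the raw prompt.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__" and os.getenv("RUN_GEMMA_DEMO"):
    # Downloads ~2.6B parameters of weights; enable explicitly.
    print(complete())
```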
Ideal Use Cases
- Fine-tuning: Best suited as a foundation for further fine-tuning on specific telecommunications applications.
- Domain-Specific Text Completion: Effective for generating text completions within the telecommunications field, as demonstrated by its ability to complete prompts like "Shannon capacity is" with accurate, domain-relevant information.
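The "Shannon capacity is" prompt refers to the Shannon-Hartley theorem, C = B log2(1 + S/N), the kind of domain fact the model is expected to complete correctly. As a quick numeric illustration (standard formula, not model output):

```python
import math


def shannon_capacity(bandwidth_hz, snr_linear):
    """Shannon-Hartley channel capacity in bits per second:
    C = B * log2(1 + S/N), with SNR given as a linear ratio."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)


# A 1 MHz channel at a linear SNR of 15 carries
# 1e6 * log2(16) = 4 Mbit/s.
capacity = shannon_capacity(1e6, 15)
```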