AliMaatouk/Gemma-2-2B-Tele

Hugging Face
Text Generation · Concurrency Cost: 1 · Model Size: 2.6B · Quant: BF16 · Ctx Length: 8k · License: gemma · Architecture: Transformer

Gemma-2-2B-Tele is a 2.6 billion parameter Transformer model developed by Ali Maatouk, specialized for telecommunications. Based on Google's Gemma-2-2B, it was continually pretrained on 2.5 billion tokens of telecommunications data, achieving superior performance on telecommunications benchmarks such as Tele-Eval while maintaining general language understanding. This base model, with an 8192-token context length, is designed for fine-tuning on telecommunications-related applications.


Model Overview

Gemma-2-2B-Tele is a 2.6 billion parameter Transformer model, an adaptation of Google's Gemma-2-2B, developed by Ali Maatouk. Its primary specialization is the telecommunications domain, achieved through continual pretraining on a large-scale dataset called Tele-Data, comprising approximately 2.5 billion tokens of telecommunications-specific content.

Key Capabilities and Performance

  • Telecommunications Specialization: The model significantly outperforms the original Gemma-2-2B on telecommunications benchmarks such as Tele-Eval.
  • General Performance Retention: Despite its domain specialization, Gemma-2-2B-Tele maintains performance comparable to the original Gemma-2-2B across benchmarks for common sense, language understanding, and logical reasoning.
  • Context Length: It supports a context length of 8192 tokens.
  • Base Model: This is a base model, meaning it is not instruction-tuned and operates as a text completion engine. An instruction-tuned version, Gemma-2-2B-Tele-it, is also available.

Ideal Use Cases

  • Fine-tuning: Best suited as a foundation for further fine-tuning on specific telecommunications applications.
  • Domain-Specific Text Completion: Effective for generating text completions within the telecommunications field, as demonstrated by its ability to complete prompts like "Shannon capacity is" with accurate, domain-relevant information.
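Since this is a base (non-instruction-tuned) model, it is queried as a plain text-completion engine. A minimal inference sketch using the Hugging Face `transformers` library is shown below; the model ID matches this card, but generation parameters such as `max_new_tokens` are illustrative choices, not values prescribed by the model authors.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and model from the Hugging Face Hub.
model_id = "AliMaatouk/Gemma-2-2B-Tele"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

# A base model completes text, so prompts should read like the start of a passage.
prompt = "Shannon capacity is"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy decoding for a short, deterministic completion (illustrative settings).
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=100, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt itself.
generated = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))
```

Because the model is not instruction-tuned, question-style prompts ("What is Shannon capacity?") may be completed with more questions rather than answers; phrasing the prompt as the beginning of a statement generally works better.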