AliMaatouk/Llama-3.2-3B-Tele
Llama-3.2-3B-Tele is a 3-billion-parameter Transformer model developed by Ali Maatouk, based on Meta's Llama-3.2-3B. It was continually pretrained on roughly 2.5 billion tokens of telecommunications data, specializing it for telecommunications-related tasks. The model maintains the base model's performance on general benchmarks while outperforming it on telecommunications-specific evaluations. It features an 8192-token context length and is primarily intended as a base model for fine-tuning within the telecommunications domain.
Overview
Llama-3.2-3B-Tele is a 3-billion-parameter Transformer model, developed by Ali Maatouk, specifically adapted to the telecommunications domain. It is built upon Meta's Llama-3.2-3B and was continually pretrained on a specialized dataset, Tele-Data, comprising approximately 2.5 billion tokens of telecommunications-related articles, standards, and web content.
Key Capabilities and Performance
- Domain Specialization: Adapted to telecommunications tasks through continual pretraining on the Tele-Data corpus of articles, standards, and web content.
- Enhanced Telecommunications Performance: Outperforms the base Llama-3.2-3B model on telecommunications benchmarks like Tele-Eval by several percentage points.
- General Performance Retention: Matches the original Llama-3.2-3B on common-sense, language-understanding, and logical-reasoning benchmarks, indicating minimal loss of general capability.
- Context Length: Supports an 8192-token context window (see the quick check after this list).
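The context window can be sanity-checked from the published configuration. This is a minimal sketch using the Hugging Face transformers library; note that, depending on how the checkpoint was exported, the reported field may reflect the training context length or the base model's maximum.

```python
# Quick check of the advertised context length via the model config.
# Assumes the checkpoint is hosted under the Hub ID shown in this card.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("AliMaatouk/Llama-3.2-3B-Tele")
# The card states an 8192-token context; this field may or may not
# match exactly, depending on how the config was exported.
print(config.max_position_embeddings)
```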
Usage and Recommendations
Llama-3.2-3B-Tele is a base model designed for text completion and is best suited for further fine-tuning on specific telecommunications applications. It is not instruction-tuned; for an instruction-following version, refer to Llama-3.2-3B-Tele-it.
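Below is a minimal text-completion sketch using the Hugging Face transformers library. The Hub ID matches this card's header; the prompt and generation settings are illustrative choices, not recommendations from the model authors.

```python
# Text-completion sketch for Llama-3.2-3B-Tele with transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AliMaatouk/Llama-3.2-3B-Tele"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumes bfloat16-capable hardware
    device_map="auto",           # requires the accelerate package
)

# This is a base model: phrase the input as text to be completed,
# not as an instruction. The prompt below is purely illustrative.
prompt = "In 5G NR, dual connectivity allows a user equipment to"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For instruction-style prompting, the Llama-3.2-3B-Tele-it variant mentioned above is the more appropriate starting point.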
Citation
For more details, refer to the paper "Tele-LLMs: A Series of Specialized Large Language Models for Telecommunications".