AliMaatouk/Llama-3.2-3B-Tele
Llama-3.2-3B-Tele is a 3-billion-parameter Transformer model developed by Ali Maatouk, based on Meta's Llama-3.2-3B. It was continually pretrained on roughly 2.5 billion tokens of telecommunications data, specializing it for telecommunications-related tasks. The model maintains the base model's performance on general benchmarks while outperforming it on telecommunications-specific evaluations. It features an 8192-token context length and is primarily intended as a base model for fine-tuning within the telecommunications domain.
Overview
Llama-3.2-3B-Tele is a 3-billion-parameter Transformer model, developed by Ali Maatouk, specifically adapted to the telecommunications domain. It is built upon Meta's Llama-3.2-3B and was continually pretrained on a specialized dataset, Tele-Data, comprising approximately 2.5 billion tokens of telecommunications-related articles, standards, and web content.
Key Capabilities and Performance
- Domain Specialization: Adapted to telecommunications tasks through continual pretraining on the Tele-Data corpus of articles, standards, and web content.
- Enhanced Telecommunications Performance: Outperforms the base Llama-3.2-3B model on telecommunications benchmarks like Tele-Eval by several percentage points.
- General Performance Retention: Matches the original Llama-3.2-3B on common-sense, language-understanding, and logical-reasoning benchmarks, indicating minimal loss of general capability.
- Context Length: Supports an 8192-token context window (see the quick check after this list).
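The context window can be sanity-checked from the published configuration. This is a minimal sketch using the Hugging Face transformers library; note that, depending on how the checkpoint was exported, the reported field may reflect the training context length or the base model's maximum.

```python
# Quick check of the advertised context length via the model config.
# Assumes the checkpoint is hosted under the Hub ID shown in this card.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("AliMaatouk/Llama-3.2-3B-Tele")
# The card states an 8192-token context; this field may or may not
# match exactly, depending on how the config was exported.
print(config.max_position_embeddings)
```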
Usage and Recommendations
Llama-3.2-3B-Tele is a base model designed for text completion and is best suited for further fine-tuning on specific telecommunications applications. It is not instruction-tuned; for an instruction-following version, refer to Llama-3.2-3B-Tele-it.
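Below is a minimal text-completion sketch using the Hugging Face transformers library. The Hub ID matches this card's header; the prompt and generation settings are illustrative choices, not recommendations from the model authors.

```python
# Text-completion sketch for Llama-3.2-3B-Tele with transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AliMaatouk/Llama-3.2-3B-Tele"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumes bfloat16-capable hardware
    device_map="auto",           # requires the accelerate package
)

# This is a base model: phrase the input as text to be completed,
# not as an instruction. The prompt below is purely illustrative.
prompt = "In 5G NR, dual connectivity allows a user equipment to"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For instruction-style prompting, the Llama-3.2-3B-Tele-it variant mentioned above is the more appropriate starting point.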
Citation
For more details, refer to the paper "Tele-LLMs: A Series of Specialized Large Language Models for Telecommunications".