AliMaatouk/Llama-3.2-3B-Tele

Hugging Face
Text generation · Model size: 3B · Quantization: BF16 · Published: Apr 16, 2025 · License: llama3.2 · Architecture: Transformer

Llama-3.2-3B-Tele is a 3 billion parameter Transformer model developed by Ali Maatouk, based on Meta's Llama-3.2-3B. It was continually pretrained on approximately 2.5 billion tokens of telecommunications data, specializing it for telecommunications-related tasks. The model maintains performance on general benchmarks while outperforming its base model on telecommunications-specific evaluations. It features an 8192-token context length and is primarily intended as a base model for fine-tuning within the telecommunications domain.


Overview

Llama-3.2-3B-Tele is a 3 billion parameter Transformer model, developed by Ali Maatouk, specifically adapted for the telecommunications domain. It is built upon Meta's Llama-3.2-3B and underwent continual pretraining on a specialized dataset, Tele-Data, comprising approximately 2.5 billion tokens of telecommunications-related articles, standards, and web content.

Key Capabilities and Performance

  • Domain Specialization: Excels in telecommunications tasks due to extensive pretraining on relevant data.
  • Enhanced Telecommunications Performance: Outperforms the base Llama-3.2-3B model on telecommunications benchmarks like Tele-Eval by several percentage points.
  • General Performance Retention: Matches the performance of the original Llama-3.2-3B across common sense, language understanding, and logical reasoning benchmarks, indicating minimal compromise in general capabilities.
  • Context Length: Supports an 8192-token context window.

Usage and Recommendations

Llama-3.2-3B-Tele is a base model designed for text completion and is best suited for further fine-tuning on specific telecommunications applications. It is not instruction-tuned; for an instruction-following version, refer to Llama-3.2-3B-Tele-it.
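Since this is a plain text-completion model, it can be loaded with the standard Hugging Face `transformers` causal-LM classes; no chat template applies. The sketch below is a minimal, hedged example assuming the repo id `AliMaatouk/Llama-3.2-3B-Tele` and BF16 weights; the prompt is an illustrative telecom-style completion, not from the model card.

```python
# Minimal sketch: text completion with Llama-3.2-3B-Tele (a base model,
# not instruction-tuned). Assumes the `transformers` and `torch` packages
# and the Hugging Face repo id "AliMaatouk/Llama-3.2-3B-Tele".
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AliMaatouk/Llama-3.2-3B-Tele"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the card's BF16 quantization
    device_map="auto",
)

# Base models continue the prompt rather than follow instructions,
# so phrase the input as text to be completed.
prompt = "In 5G NR, the primary synchronization signal (PSS) is used to"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For instruction-following behavior (e.g. question answering without completion-style prompting), the fine-tuned Llama-3.2-3B-Tele-it variant would be the appropriate starting point instead.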

Citation

For more details, refer to the paper: Tele-LLMs: A Series of Specialized Large Language Models for Telecommunications.