farbodtavakkoli/OTel-LLM-20B-IT
OTel-LLM-20B-IT is a 20 billion parameter language model developed by Farbod Tavakkoli, fine-tuned from openai/gpt-oss-20b. It is specialized for the telecommunications domain, trained on extensive telecom-focused data curated by industry experts and institutions. The model is designed to power Retrieval-Augmented Generation (RAG) pipelines, providing accurate, context-grounded responses within the telecom sector.
OTel-LLM-20B-IT: A Specialized Telecom Language Model
OTel-LLM-20B-IT is a 20 billion parameter language model, fine-tuned from the openai/gpt-oss-20b base model. It is a core component of the OTel Family of Models, an open-source initiative by Farbod Tavakkoli aimed at developing industry-standard AI for the global telecommunications sector. The model was trained using full parameter fine-tuning on a comprehensive dataset curated by over 100 domain experts, including data from arXiv telecom papers, 3GPP standards, GSMA documents, IETF RFCs, and O-RAN specifications.
Key Capabilities
- Telecom Domain Specialization: Highly proficient in understanding and generating content related to telecommunications, leveraging a unique dataset from various institutional partners.
- RAG Pipeline Integration: Designed to function as the generative component within a Retrieval-Augmented Generation (RAG) pipeline, working alongside OTel Embedding and Reranker models.
- Context-Grounded Generation: Optimized for generating accurate responses from the provided context, with abstention training that teaches the model to decline to answer when the available context is insufficient, reducing hallucination.
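The context-grounded, abstention-aware behavior above depends on how retrieved passages are presented to the model. A minimal sketch of prompt assembly for such a RAG setup, where the message layout and abstention wording are illustrative assumptions rather than the model's documented template:

```python
ABSTAIN_RULE = (
    "Answer strictly from the numbered context passages. "
    "If the passages do not contain the answer, reply exactly: "
    '"I don\'t have enough context to answer that."'
)

def build_rag_messages(question: str, passages: list[str]) -> list[dict]:
    """Assemble chat messages that ground the model in retrieved passages."""
    # Number each passage so the model can reference them unambiguously.
    context = "\n\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return [
        {"role": "system", "content": ABSTAIN_RULE},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]

msgs = build_rag_messages(
    "What does 3GPP TS 38.331 specify?",
    ["3GPP TS 38.331 specifies the NR Radio Resource Control (RRC) protocol."],
)
```

In a full deployment these messages would typically be fed through the tokenizer's chat template (e.g. `tokenizer.apply_chat_template(msgs, ...)`) before generation with the model.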
Intended Use Cases
- Answering technical queries within the telecommunications domain.
- Summarizing and extracting information from telecom specifications, standards, and documentation.
- Supporting knowledge retrieval systems for telecom professionals and researchers.
- Developing specialized AI applications for the global telecommunications industry.
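The knowledge-retrieval use case above can be wired as a simple retrieve-rerank-generate pipeline; in practice, the OTel Embedding and Reranker models would fill the first two stages and OTel-LLM-20B-IT the last. A hedged sketch with stub components, where every function and scoring rule is a placeholder:

```python
from typing import Callable

def answer(question: str,
           retrieve: Callable[[str], list[str]],
           rerank: Callable[[str, list[str]], list[str]],
           generate: Callable[[str, list[str]], str],
           top_k: int = 3) -> str:
    """Retrieve candidate passages, keep the top-ranked ones, then generate."""
    candidates = retrieve(question)
    best = rerank(question, candidates)[:top_k]
    return generate(question, best)

# Stub components for illustration only; a real deployment would plug in
# the OTel Embedding model (retrieve), OTel Reranker (rerank), and
# OTel-LLM-20B-IT (generate) here.
docs = [
    "O-RAN splits the RAN into CU, DU and RU components.",
    "IETF RFCs define Internet protocols.",
]
result = answer(
    "What does O-RAN split the RAN into?",
    retrieve=lambda q: docs,
    rerank=lambda q, cs: sorted(cs, key=lambda c: -sum(w in c for w in q.split())),
    generate=lambda q, ctx: ctx[0],  # stub: echo the top passage
)
```

The separation into three callables mirrors the model card's framing of OTel-LLM-20B-IT as the generative component of a larger RAG stack, so each stage can be swapped independently.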