AI-Sweden-Models/gpt-sw3-40b
AI-Sweden-Models/gpt-sw3-40b is a 39.9-billion-parameter decoder-only transformer language model developed by AI Sweden in collaboration with RISE and WASP WARA. It was pretrained on 320 billion tokens of Swedish, Norwegian, Danish, Icelandic, English, and programming code using the NeMo Megatron GPT implementation. The model generates coherent text in the Nordic languages and English and can be applied to a range of text generation tasks.
Model Overview
AI Sweden's GPT-Sw3 40B is a large decoder-only transformer model, part of the GPT-Sw3 series, developed in collaboration with RISE and WASP WARA for Media and Language. It is pretrained on a substantial dataset of 320 billion tokens, encompassing Swedish, Norwegian, Danish, Icelandic, English, and programming code, using the NeMo Megatron GPT implementation.
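To give a rough sense of what 39.9 billion parameters means in practice, the sketch below estimates the memory needed just to hold the weights. The bytes-per-parameter values are assumptions about common storage formats, not figures from the model card, and the estimate ignores activations, the key-value cache, and framework overhead.

```python
# Parameter count taken from the model description above.
PARAMS = 39.9e9

# Bytes per parameter for common storage formats.
# These precisions are assumptions for illustration, not stated by the model card.
BYTES_PER_PARAM = {"float32": 4, "float16/bfloat16": 2, "int8": 1}


def weight_memory_gb(params: float, bytes_per_param: int) -> float:
    """Return weights-only memory in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    for name, size in BYTES_PER_PARAM.items():
        print(f"{name}: ~{weight_memory_gb(PARAMS, size):.1f} GB")
```

Under these assumptions, 16-bit weights alone come to roughly 80 GB, so running the model is likely to require multiple accelerators or some form of quantization.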
Key Capabilities
- Multilingual Text Generation: Capable of generating coherent text in five languages: Swedish, Norwegian, Danish, Icelandic, and English.
- Code Generation: Can generate code in four programming languages.
- Task Adaptability: Can perform various text tasks by framing them as text generation problems, even if not explicitly trained for them.
- Research Focus: Primarily intended for research and evaluation of large language models, particularly for Nordic languages.
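Framing a task as text generation usually means writing a prompt whose most natural continuation is the answer. The sketch below builds a few-shot prompt for a hypothetical sentiment-labelling task; the function name, the `Text:`/`Sentiment:` template, and the example sentences are all illustrative assumptions, not a format the model card prescribes.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    instruction: str,
    examples: List[Tuple[str, str]],
    query: str,
) -> str:
    """Assemble a few-shot prompt that ends where the model should continue.

    The template is a hypothetical illustration; a base language model has
    no required prompt format, so other layouts may work as well or better.
    """
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Text: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # Leave the final label empty so the model's continuation is the answer.
    lines.append(f"Text: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_few_shot_prompt(
        "Label the sentiment of each text as positive or negative.",
        [
            ("The concert was wonderful.", "positive"),
            ("The train was late again.", "negative"),
        ],
        "I really enjoyed the book.",
    )
    print(prompt)
```

The resulting string would be passed to the model as-is, and the first generated word or line would be read as the predicted label. Because the model was not trained specifically for such tasks, output quality should be checked empirically.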
Intended Use Cases
- Research and Development: Ideal for organizations and individuals within the Nordic NLP ecosystem to validate, test, and provide feedback on LLMs.
- Text Generation: Suitable for applications requiring text creation in its supported languages.
- Exploration of LLM Capabilities: Useful for studying the performance and limitations of large language models in a multilingual context.
Limitations
Like other large language models, GPT-Sw3 40B has limitations, including bias, safety risks, and quality issues such as hallucination and limited generation diversity. It may produce stereotyped, hateful, abusive, or discriminatory language, as well as incorrect or irrelevant information. Users should be aware of these limitations and put appropriate safeguards in place.
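One rough way to check for low generation diversity is the distinct-n ratio: the share of n-grams across a set of outputs that are unique. The sketch below is a minimal illustration, not an evaluation method from the model card; it assumes whitespace tokenization, which is a simplification, and the function name is hypothetical.

```python
from typing import List


def distinct_n(texts: List[str], n: int = 2) -> float:
    """Return unique n-grams divided by total n-grams over all texts.

    Values near 1.0 suggest varied output; values near 0.0 suggest heavy
    repetition. Tokenization is plain whitespace splitting (an assumption).
    """
    ngrams = []
    for text in texts:
        tokens = text.split()
        ngrams.extend(
            tuple(tokens[i : i + n]) for i in range(len(tokens) - n + 1)
        )
    if not ngrams:
        # No n-grams to compare (empty input or texts shorter than n).
        return 0.0
    return len(set(ngrams)) / len(ngrams)


if __name__ == "__main__":
    samples = ["the cat sat on the mat", "the cat sat on the sofa"]
    print(f"distinct-2: {distinct_n(samples, n=2):.2f}")
```

A score like this is only a coarse signal. It says nothing about factual accuracy or harmful content, which need separate checks.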