AI-Sweden-Models/gpt-sw3-20b-instruct is a 20.9-billion-parameter decoder-only transformer language model developed by AI Sweden in collaboration with RISE and the WASP WARA for Media and Language. It was pretrained on 320 billion tokens of Swedish, Norwegian, Danish, Icelandic, English, and programming code, then fine-tuned on instruction data. The model generates coherent text in five natural languages and four programming languages, and can perform a range of text tasks by following instructions.
Overview
GPT-SW3 is a family of large decoder-only transformer language models developed by AI Sweden in collaboration with RISE and the WASP WARA for Media and Language. This particular model, gpt-sw3-20b-instruct, is the 20.9-billion-parameter variant, fine-tuned on instruction data in both chat and raw-text formats.
Key Capabilities
- Multilingual Text Generation: Capable of generating coherent text in Swedish, Norwegian, Danish, Icelandic, and English.
- Multilingual Code Generation: Generates code in four programming languages.
- Instruction Following: Can be instructed to perform various text tasks, including ones it was not explicitly trained on, by framing them as text generation tasks.
- Extensive Training Data: Pretrained on a diverse dataset of 320 billion tokens, including a substantial share of Nordic-language text and programming code.
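As an illustration of framing a task as text generation, the sketch below builds a chat-style prompt and, optionally, passes it to the model through the Hugging Face transformers library. The turn layout (`<|endoftext|><s>`, `User:`, `Bot:`) is an assumption taken from the upstream model card and is not stated in this summary; `build_chat_prompt` and `generate_reply` are hypothetical helper names, and the sampling settings are placeholders rather than recommended values.

```python
MODEL_NAME = "AI-Sweden-Models/gpt-sw3-20b-instruct"


def build_chat_prompt(turns):
    """Build a chat prompt that ends with an open Bot turn.

    `turns` is a list of (role, text) pairs, where role is "User" or "Bot".
    The separator tokens below are an assumed format, not a confirmed spec.
    """
    parts = ["<|endoftext|><s>"]
    for role, text in turns:
        parts.append(f"{role}:\n{text}\n<s>")
    parts.append("Bot:")  # leave the Bot turn open for the model to complete
    return "\n".join(parts)


def generate_reply(prompt, model_name=MODEL_NAME, max_new_tokens=100):
    """Generate a continuation for `prompt`.

    Imports are local so that the prompt helper above works without
    torch or transformers installed. Loading a 20.9B-parameter model
    needs substantial memory.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    device = "cuda:0" if torch.cuda.is_available() else "cpu"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()
    model.to(device)

    input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"].to(device)
    with torch.no_grad():
        output = model.generate(
            input_ids=input_ids,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            temperature=0.6,  # placeholder sampling settings
        )
    # Keep only the newly generated tokens; the model may run on into
    # another turn, so callers may want to truncate at the next separator.
    new_tokens = output[0][input_ids.shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# A summarization task framed as a single user instruction.
prompt = build_chat_prompt(
    [("User", "Sammanfatta följande text i en mening: Träd är fina.")]
)
```

Calling `generate_reply(prompt)` would then return the model's continuation of the open `Bot:` turn. Multi-turn context can be expressed by passing earlier `("User", ...)` and `("Bot", ...)` pairs in order.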
Intended Use Cases
- Research and Evaluation: Primarily intended for research and evaluation of Large Language Models, especially concerning their capabilities in Nordic languages.
- Text Generation: Suitable for generating human-like text across its supported languages.
- Instruction-based Tasks: Can be used for tasks requiring the model to follow specific instructions, such as question answering, summarization, or creative writing.
Limitations
Like other large language models, GPT-SW3 has limitations regarding bias, safety, generation diversity, and hallucination. It may overrepresent certain viewpoints, reproduce stereotypes, or generate inappropriate content. Users should be aware of these issues and put appropriate safeguards in place.