timpal0l/gpt-sw3-126m-instruct
The timpal0l/gpt-sw3-126m-instruct is a 126 million parameter instruction-tuned, decoder-only transformer language model developed by AI Sweden in collaboration with RISE and WASP WARA. It was fine-tuned from a GPT-Sw3 base model that was pretrained on 320 billion tokens of Swedish, Norwegian, Danish, Icelandic, English, and programming code. The model is designed for generating coherent text and performing instruction-based text tasks in the Nordic languages and English.
Model Overview
The timpal0l/gpt-sw3-126m-instruct is a 126 million parameter instruction-tuned model from the GPT-Sw3 family, developed by AI Sweden in collaboration with RISE and WASP WARA. It is a decoder-only transformer pretrained on a 320 billion token dataset covering Swedish, Norwegian, Danish, Icelandic, English, and programming code. The instruction-tuned variant was fine-tuned on both chat-style and raw-text instruction data, including datasets such as Dolly, Open Assistant, OIG, and a Swedish pharmaceutical Q&A dataset (Fass).
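The model can be loaded with the Hugging Face transformers library like any causal language model. The sketch below shows basic text generation; the prompt and generation parameters (max_new_tokens, top_p, temperature) are illustrative choices, not recommended settings from the model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model id as published on the Hugging Face Hub
model_id = "timpal0l/gpt-sw3-126m-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

# Swedish prompt: "Trees are nice because"
prompt = "Träd är fina för att"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=50,
        do_sample=True,
        top_p=0.9,
        temperature=0.7,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```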
Key Capabilities
- Multilingual Text Generation: Capable of generating coherent text in Swedish, Norwegian, Danish, Icelandic, and English.
- Code Generation: Supports code generation in four programming languages.
- Instruction Following: Interprets natural-language instructions to perform a range of text tasks, including tasks it was not explicitly trained on (see the sketch after this list).
- Nordic Language Focus: Specifically trained with a significant portion of Nordic language data, making it suitable for applications requiring strong performance in these languages.
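Because the model is instruction tuned on chat-style data, prompting it with explicit User/Bot turns typically works better than plain completion. The sketch below assumes the turn format used by the GPT-Sw3 instruct family, with turns separated by the <s> token; the exact template is an assumption here and should be verified against the base model card and the tokenizer's special tokens.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "timpal0l/gpt-sw3-126m-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Assumed turn-based prompt template in the style of the GPT-Sw3 instruct
# models; check the model card for the exact format.
# The user question is Swedish for "What is the capital of Sweden?"
prompt = (
    "<|endoftext|><s>\n"
    "User:\n"
    "Vad är huvudstaden i Sverige?\n"
    "<s>\n"
    "Bot:\n"
)

inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    top_p=0.9,
    temperature=0.6,
)
print(tokenizer.decode(output_ids[0]))
```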
Intended Use Cases
This model is primarily intended for research and evaluation within the Nordic NLP ecosystem. It is suitable for generating text, responding to instructions, and exploring the capabilities of LLMs in Nordic languages. Users are encouraged to provide feedback for validation and testing.