timpal0l/gpt-sw3-1.3b-instruct
The timpal0l/gpt-sw3-1.3b-instruct is a 1.4 billion parameter instruction-tuned decoder-only transformer language model developed by AI Sweden. It is part of the GPT-SW3 collection, trained on a 320 billion token dataset encompassing Swedish, Norwegian, Danish, Icelandic, English, and programming code. This model is designed for generating coherent text and performing instruction-based tasks across these five languages and four programming languages.
Model Overview
The timpal0l/gpt-sw3-1.3b-instruct is a 1.4 billion parameter instruction-tuned model from the GPT-SW3 series, developed by AI Sweden in collaboration with RISE and WASP WARA for Media and Language. It is a decoder-only transformer pretrained on a 320 billion token dataset, notable for its extensive coverage of the Nordic languages (Swedish, Norwegian, Danish, Icelandic) alongside English and programming code.
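Assuming the model is published on the Hugging Face Hub under the ID above, a minimal loading sketch with the `transformers` library might look like the following. The generation settings (`temperature`, `top_p`, `max_new_tokens`) are illustrative defaults, not recommendations from this card:

```python
# Minimal sketch: load the model and generate text with Hugging Face transformers.
# Model ID comes from the card; sampling parameters are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "timpal0l/gpt-sw3-1.3b-instruct"


def generate(prompt: str, max_new_tokens: int = 100) -> str:
    """Generate a continuation for `prompt` with sampling."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example usage (downloads the model weights on first run):
# print(generate("Översätt till engelska: Jag heter Anna."))
```

Because the model is instruction-tuned, plain continuations work, but prompts formatted as instructions generally give better results.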
Key Capabilities
- Multilingual Text Generation: Capable of generating coherent text in Swedish, Norwegian, Danish, Icelandic, and English.
- Code Generation: Supports text generation in four programming languages.
- Instruction Following: Fine-tuned on instruction data, so it can carry out a wide range of text tasks when prompted, including tasks it was not explicitly trained on, by framing them as text generation.
- Nordic Language Focus: Specifically designed to address the need for large language models in Nordic languages.
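Instruction-tuned GPT-SW3 variants expect prompts in a turn-based chat format. The helper below mirrors the User/Bot layout shown in AI Sweden's GPT-SW3 instruct examples; the exact template is an assumption here and should be verified against the tokenizer's special tokens before use:

```python
def build_prompt(user_message: str) -> str:
    """Format a single-turn instruction prompt.

    The <|endoftext|><s> ... <s> turn separators follow the layout used
    in AI Sweden's GPT-SW3 instruct examples (assumed, not verified here).
    """
    return (
        "<|endoftext|><s>\n"
        "User:\n"
        f"{user_message}\n"
        "<s>\n"
        "Bot:\n"
    ).strip()


# Example usage:
# prompt = build_prompt("Vad är huvudstaden i Sverige?")
# The model's reply is then generated as a continuation after "Bot:".
```

Keeping the prompt template in one helper makes it easy to swap in the correct separators if the tokenizer defines them differently.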
Training Details
The model was pretrained using a causal language modeling (CLM) objective with the NeMo Megatron GPT implementation. The instruction-tuned variants, like this one, were further fine-tuned using both chat and raw text instruction formats. The training data includes diverse sources such as books, articles, code (from GitHub), conversational data (e.g., Reddit, Familjeliv), mathematical datasets, and extensive web crawls (Common Crawl, Wikipedia).
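The causal language modeling objective trains the model to predict each token from all of the tokens before it. A toy sketch of how (context, target) pairs arise from a token sequence (real implementations do this in parallel by shifting the labels one position, not with an explicit loop):

```python
def clm_pairs(token_ids: list[int]) -> list[tuple[list[int], int]]:
    """Enumerate the next-token prediction problems in one sequence.

    For each position t, the model conditions on token_ids[:t] and is
    trained to predict token_ids[t]. Frameworks implement the same thing
    by feeding token_ids[:-1] as inputs and token_ids[1:] as labels.
    """
    return [(token_ids[:t], token_ids[t]) for t in range(1, len(token_ids))]


# Example: a 3-token sequence yields 2 prediction problems.
# clm_pairs([5, 8, 2]) -> [([5], 8), ([5, 8], 2)]
```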
Limitations
Like other large language models, GPT-SW3 has limitations regarding bias, safety, generation diversity, and hallucination. It may overrepresent certain viewpoints, contain stereotypes, and generate inappropriate or incorrect content. Users should be aware of these potential issues and implement appropriate safeguards.