AI-Sweden-Models/gpt-sw3-356m
Text Generation · Concurrency Cost: 1 · Model Size: 0.5B · Quant: BF16 · Ctx Length: 2k · Published: Dec 14, 2022 · License: other · Architecture: Transformer

GPT-Sw3 356M is a 356 million parameter decoder-only transformer language model developed by AI Sweden in collaboration with RISE and WASP WARA for Media and Language. Trained on 320 billion tokens spanning Swedish, Norwegian, Danish, Icelandic, English, and programming code, it generates coherent text in five natural languages and four programming languages. It supports a context length of 2,048 tokens and is part of a collection focused on advancing large language models for Nordic languages.

GPT-Sw3 356M: A Multilingual Nordic LLM

GPT-Sw3 356M is a 356 million parameter decoder-only transformer language model developed by AI Sweden in collaboration with RISE and WASP WARA for Media and Language. It is part of the broader GPT-Sw3 collection, which aims to advance large language models for Nordic languages.
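
The model can be used through the Hugging Face transformers library. The following is a minimal sketch, assuming transformers and torch are installed; the Swedish prompt and the sampling settings are illustrative, not tuned recommendations.

```python
# Minimal sketch: load GPT-Sw3 356M and sample a Swedish continuation.
# Assumes `transformers` and `torch` are installed; the sampling
# settings below are illustrative, not tuned recommendations.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AI-Sweden-Models/gpt-sw3-356m"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).to(device)
model.eval()

# A Swedish prompt: "Trees are nice because"
prompt = "Träd är fina för att"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

with torch.no_grad():
    generated = model.generate(
        input_ids,
        max_new_tokens=60,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )

print(tokenizer.decode(generated[0], skip_special_tokens=True))
```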

Key Capabilities

  • Multilingual Text Generation: Capable of generating coherent text in Swedish, Norwegian, Danish, Icelandic, and English.
  • Code Generation: Supports text generation in four programming languages.
  • Instruction Following: Can perform various text tasks it was not explicitly trained for when they are rephrased as text generation prompts (see the sketch after this list).
  • Nordic Language Focus: Trained on a substantial dataset of 320 billion tokens, with significant emphasis on Nordic languages alongside English and programming code.
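
For instance, a translation task can be cast as a completion prompt. The sketch below is hypothetical: the base model is not instruction-tuned, output quality depends heavily on the exact phrasing, and the prompt format shown is an assumption rather than an official recipe.

```python
# Hypothetical sketch: casting Swedish-to-English translation as plain
# text generation. The prompt format is an assumption, not an official
# recipe; the base model is not instruction-tuned.
from transformers import pipeline

generator = pipeline("text-generation", model="AI-Sweden-Models/gpt-sw3-356m")

# Frame the translation as a completion: the model continues after "Engelska:".
prompt = "Svenska: Jag tycker om att läsa böcker.\nEngelska:"
result = generator(prompt, max_new_tokens=20, do_sample=False)
print(result[0]["generated_text"])
```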

Intended Use and Limitations

This model is released for research and evaluation within the Nordic NLP ecosystem, both to gather feedback on its performance and to identify areas for improvement. Like other large language models, GPT-Sw3 356M has limitations, including potential bias, safety concerns, limited generation diversity, and hallucination. Users should be aware that the model may overrepresent certain viewpoints, reproduce stereotypes, or generate inappropriate content, and that it may produce factual errors or irrelevant outputs. The training data includes public sources such as Common Crawl, Reddit, and the Swedish forums Familjeliv and Flashback, which may contain offensive or sensitive content.