AI-Sweden-Models/gpt-sw3-126m
Task: Text Generation
Concurrency Cost: 1
Model Size: 0.2B
Quant: BF16
Ctx Length: 2k
Published: Dec 14, 2022
License: other
Architecture: Transformer
GPT-Sw3 126M is a decoder-only transformer language model, listed here at 0.2 billion parameters, developed by AI Sweden in collaboration with RISE and the WASP WARA for Media and Language. It was trained on 320 billion tokens of Swedish, Norwegian, Danish, Icelandic, English, and programming code, and generates coherent text in those languages. The model is intended primarily for research and evaluation of LLM capabilities in the Nordic languages. Beyond free-form multilingual text generation, it can be prompted with instructions to perform text tasks.
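As a rough illustration of how a model like this might be sampled locally, the sketch below uses the Hugging Face `transformers` library. Several things here are assumptions rather than facts from this page: that "2k" context means 2048 tokens, that the checkpoint can be downloaded under the repository name shown in the title (it may require accepting a license first), and that `sentencepiece` is installed for the tokenizer. The helper names `sampler_config` and `generate` and all default sampler values are hypothetical placeholders, not recommended settings.

```python
MODEL_ID = "AI-Sweden-Models/gpt-sw3-126m"
CONTEXT_LENGTH = 2048  # assumption: the listing's "2k" context means 2048 tokens


def sampler_config(temperature=0.7, top_p=0.9, top_k=50,
                   repetition_penalty=1.1, max_new_tokens=64):
    """Validate sampler values and return keyword arguments for `generate`.

    The defaults are illustrative placeholders, not values taken from the page.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if not 0 < top_p <= 1:
        raise ValueError("top_p must be in (0, 1]")
    if top_k < 0:
        raise ValueError("top_k must be non-negative")
    if max_new_tokens < 1:
        raise ValueError("max_new_tokens must be at least 1")
    return {
        "do_sample": True,
        "temperature": temperature,
        "top_p": top_p,
        "top_k": top_k,
        "repetition_penalty": repetition_penalty,
        "max_new_tokens": max_new_tokens,
    }


def generate(prompt, **overrides):
    """Sample a continuation of `prompt`; downloads the model on first use."""
    # Imported lazily so the config helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    config = sampler_config(**overrides)

    # Keep prompt plus generated tokens inside the assumed context window.
    budget = CONTEXT_LENGTH - inputs["input_ids"].shape[1]
    if budget <= 0:
        raise ValueError("prompt already fills the context window")
    config["max_new_tokens"] = min(config["max_new_tokens"], budget)

    output = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        **config,
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

A call such as `generate("Träd är fina för att", temperature=0.6)` would return the prompt followed by a sampled continuation. Because sampling is random, the output differs between runs.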