NorMistral-11b-long: Extended Context for Scandinavian Languages
NorMistral-11b-long is an 11.4 billion parameter causal language model developed by the Language Technology Group at the University of Oslo (LTG) within the NORA.LLM initiative. It is a length-extended version of NorMistral-11b-warm, with the context window increased to 32,768 tokens.
Key Capabilities & Features
- Extended Context: Achieves a 32,768-token context length through continual training on an additional 50 billion subword tokens.
- Multilingual Focus: Training data includes a mix of Scandinavian (Norwegian Bokmål, Nynorsk, Danish, Swedish, Icelandic, Faroese), Sámi, English, and programming code.
- Optimized Tokenizer: Uses a new tokenizer trained specifically for the target languages, with improved subword-to-word split ratios that make inference substantially faster than with the base Mistral-Nemo-Base-2407 model.
- Architecture: Based on the Mistral architecture, featuring pre-normalization with RMSNorm, SwiGLU activations, rotary positional embeddings, and grouped-query attention.
- Research-Oriented: Primarily intended for research purposes, particularly in the domain of low-resource and Scandinavian language processing.
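The rotary positional embeddings mentioned above are what make length extension like this possible: token positions are encoded as rotations of dimension pairs, so attention scores depend only on relative offsets. A minimal pure-Python sketch of the idea (an illustration of the standard RoPE scheme, not the model's actual implementation):

```python
import math

def rope(vec, pos, base=10000.0):
    """Apply rotary positional embeddings to one attention-head vector.

    Each dimension pair (i, i + half) is rotated by the angle
    pos * base**(-i / half). Because rotations compose, the dot product
    of two rotated vectors depends only on the distance between their
    positions, not on the absolute positions themselves.
    """
    half = len(vec) // 2
    out = [0.0] * len(vec)
    for i in range(half):
        theta = pos * base ** (-i / half)
        c, s = math.cos(theta), math.sin(theta)
        x1, x2 = vec[i], vec[i + half]
        out[i] = x1 * c - x2 * s          # 2-D rotation of the pair
        out[i + half] = x1 * s + x2 * c
    return out
```

The relative-position property is what continual training at longer lengths exploits: queries and keys separated by the same offset interact identically wherever they sit in the 32,768-token window.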
Good For
- Research in Scandinavian NLP: Ideal for academic and research applications focusing on Norwegian, Sámi, and other Nordic languages.
- Long-Context Tasks: Suitable for tasks that require processing long textual inputs, thanks to its 32,768-token context window.
- Continual Training Studies: A practical example of continual training for language extension, following the methodology outlined in the paper "Small Languages, Big Models: A Study of Continual Training on Languages of Norway."
- Efficient Inference: Benefits from a custom tokenizer that enhances inference speed for its target languages.
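The tokenizer-efficiency claim above is usually quantified as a subword-to-word split ratio: the average number of subword tokens produced per whitespace word. A small sketch of the metric (the chunking tokenizer here is a toy stand-in; a real measurement would plug in the model's own tokenizer, e.g. via `transformers.AutoTokenizer`):

```python
def split_ratio(texts, tokenize):
    """Average number of subword tokens per whitespace-separated word.

    Lower is better: a ratio near 1.0 means the tokenizer rarely splits
    words, which shortens sequences and speeds up inference.
    """
    n_tokens = sum(len(tokenize(t)) for t in texts)
    n_words = sum(len(t.split()) for t in texts)
    return n_tokens / n_words

# Toy stand-in that splits every word into 4-character chunks,
# mimicking a tokenizer poorly fitted to the language.
def chunk_tokenize(text, size=4):
    return [w[i:i + size] for w in text.split() for i in range(0, len(w), size)]
```

Comparing the ratio of the new tokenizer against the original Mistral-Nemo one on Norwegian text is how the speedup would show up in practice: fewer tokens per word means fewer forward passes per sentence.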