NorMistral-11b-warm is an 11 billion parameter Norwegian language model developed by the Language Technology Group (LTG) at the University of Oslo as part of the NORA.LLM family. Initialized from Mistral-Nemo-Base-2407, it was continually pretrained on 250 billion subword tokens drawn from a mix of Scandinavian, Sámi, English, and code data. The model is optimized for Norwegian and other Scandinavian languages, featuring a new tokenizer for faster inference on those languages and a hybrid masked-causal training objective, which makes it usable both as a causal generative model and as a bidirectional encoder.
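As a minimal sketch of causal generation with this model, the following assumes it is published on the Hugging Face Hub under the norallm organization as `norallm/normistral-11b-warm` (the repository id is an assumption based on the NORA.LLM naming scheme) and that the `transformers` and `torch` packages are installed:

```python
# Minimal causal-generation sketch; the repository id below is an assumption.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "norallm/normistral-11b-warm"  # assumed Hub repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half-precision weights to reduce memory use
    device_map="auto",           # place the model on available GPU(s)/CPU
)

prompt = "Oslo er hovedstaden i"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The bidirectional encoder mode enabled by the hybrid masked-causal training would be invoked differently; the sketch above covers only the standard causal generation path.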