four-two-labs/lynx-micro
Lynx-micro is a 2.6 billion parameter autoregressive transformer developed by 42 Labs, fine-tuned from Google DeepMind's Gemma 2B. The model is optimized for Swedish and English language tasks and scores just below GPT-3.5 Turbo on the ScandEval Swedish NLG benchmark. It is particularly capable for its size, making it well suited to applications that need efficient, high-quality language processing in Swedish.
Lynx-micro: A Compact Swedish-English LLM
Lynx-micro is the inaugural release in 42 Labs' "Lynx" series of Swedish large language models. This 2.6 billion parameter autoregressive transformer is a fine-tune of Google DeepMind's Gemma 2B, designed for both Swedish and English language processing.
Key Capabilities & Performance
- Strong Swedish NLG Performance: Lynx-micro scores just below GPT-3.5 Turbo on the ScandEval Swedish NLG benchmark, outperforming many larger models in its category.
- Efficient for its Size: Despite its small parameter count, Lynx-micro punches above its weight, making it a capable option where resource efficiency matters.
- Multilingual Support: Supports both Swedish and English, with training on high-quality Swedish instruct data (single and multi-turn) and Swedish-English translations.
Training Details
The model was trained on a proprietary dataset of approximately 1.35 million examples. For efficiency, examples were packed into 8K-token context windows, reducing the number of training rows by 88% (from roughly 1.35 million to about 160,000 packed rows). Training used Hugging Face Accelerate and TRL.
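The packing step described above can be sketched with a simple greedy strategy: concatenate consecutive examples until the next one would overflow the context window, then start a new row. This is a hypothetical illustration, not 42 Labs' actual pipeline; the 8192-token window follows the card, while the token counts are toy values.

```python
# Hypothetical sketch of greedy example packing into fixed context windows.
# The 8192-token window size follows the card; everything else is illustrative.

CONTEXT_LEN = 8192

def pack_examples(token_lengths, window=CONTEXT_LEN):
    """Greedily pack per-example token counts into rows of at most `window` tokens."""
    packs, current, used = [], [], 0
    for n in token_lengths:
        n = min(n, window)  # truncate anything longer than a single window
        if used + n > window:  # next example would overflow: close this row
            packs.append(current)
            current, used = [], 0
        current.append(n)
        used += n
    if current:
        packs.append(current)
    return packs

# Toy illustration: many short examples collapse into far fewer packed rows,
# which is how 1.35M examples could shrink by ~88% after packing.
lengths = [700, 1200, 300] * 100  # 300 short "examples"
packs = pack_examples(lengths)
print(len(lengths), "examples ->", len(packs), "packed rows")
```

Real packers typically also insert separator tokens and mask attention across example boundaries, which this sketch omits.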
Ideal Use Cases
- Swedish Language Applications: Excellent for tasks requiring high-quality Swedish text generation, translation, and understanding.
- Resource-Constrained Environments: Its small size makes it suitable for deployment in scenarios where larger models might be impractical due to computational or memory limitations.
- Benchmarking & Research: Provides a strong baseline for further research and development in Swedish LLMs.
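For the use cases above, the model can be loaded with the standard Hugging Face Transformers API. The repo id comes from this card's header; chat-template support and the generation settings are assumptions inherited from the Gemma 2B base, not documented specifics of Lynx-micro. The heavy imports are deferred into the function so the sketch can be read (and sanity-checked) without the model downloaded.

```python
MODEL_ID = "four-two-labs/lynx-micro"  # repo id from this card's header

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Load Lynx-micro and generate a completion for a single user turn.

    Assumes the tokenizer ships a chat template (as Gemma-based instruct
    models typically do); this is an assumption, not a documented guarantee.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)

if __name__ == "__main__":
    # Swedish prompt: "Translate to English: Stockholm is the capital of Sweden."
    print(generate("Översätt till engelska: Stockholm är Sveriges huvudstad."))
```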