unsloth/LFM2-350M
The LFM2-350M model by Liquid AI is a 0.35 billion parameter hybrid language model with a 32,768 token context length, designed for edge AI and on-device deployment. It features a new architecture with multiplicative gates and short convolutions, offering 3x faster training than Liquid AI's previous generation and 2x faster decode and prefill on CPU than Qwen3. The model targets quality, speed, and memory efficiency for agentic tasks, data extraction, RAG, creative writing, and multi-turn conversations.
LFM2-350M: A Hybrid Model for Edge AI
LFM2-350M, developed by Liquid AI, is a 0.35 billion parameter hybrid model engineered for efficient edge AI and on-device deployment. Its architecture combines multiplicative gates and short convolutions: 10 double-gated short-range LIV convolution blocks and 6 grouped query attention blocks, for 16 blocks in total. This design is what gives the model its speed and memory efficiency.
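This page does not spell out how a double-gated short convolution is computed. One plausible reading, offered only as a toy sketch, is: gate the input, run a short causal convolution, then gate the output. The function name, the single-channel setup, and passing the gates in as arguments are all assumptions made for illustration; in a real block the gates would come from learned projections of the input.

```python
from typing import List


def double_gated_short_conv(
    x: List[float],
    gate_in: List[float],
    gate_out: List[float],
    kernel: List[float],
) -> List[float]:
    """Toy 1-D, single-channel sketch of a double-gated short convolution.

    Assumed structure (not confirmed by this page):
      1. multiply the input by an input gate,
      2. apply a short causal convolution,
      3. multiply the result by an output gate.
    """
    if not (len(x) == len(gate_in) == len(gate_out)):
        raise ValueError("x and both gates must have the same length")

    # Step 1: input gate (elementwise multiplication).
    gated = [g * v for g, v in zip(gate_in, x)]

    out = []
    for t in range(len(gated)):
        acc = 0.0
        # Step 2: causal convolution. kernel[0] weights the current step,
        # kernel[j] weights the step j positions back; future steps are unused.
        for j, w in enumerate(kernel):
            if t - j >= 0:
                acc += w * gated[t - j]
        # Step 3: output gate.
        out.append(gate_out[t] * acc)
    return out


# Block counts taken from the description above.
NUM_CONV_BLOCKS = 10
NUM_GQA_BLOCKS = 6
TOTAL_BLOCKS = NUM_CONV_BLOCKS + NUM_GQA_BLOCKS
```

The kernel is kept short on purpose: each output depends on only a few recent positions, which is the property that makes this kind of block cheap compared with full attention.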
Key Capabilities & Features
- Optimized Performance: Achieves 3x faster training than Liquid AI's previous generation and 2x faster decode and prefill on CPU than Qwen3. It also outperforms similarly sized models on benchmarks covering knowledge, mathematics, instruction following, and multilingual tasks.
- Flexible Deployment: Designed to run efficiently on CPU, GPU, and NPU hardware, making it suitable for deployment on smartphones, laptops, and vehicles.
- Multilingual Support: Supports English, Arabic, Chinese, French, German, Japanese, Korean, and Spanish.
- Tool Use: Incorporates a structured tool-use mechanism, allowing for function definition, calling, execution, and interpretation within conversations.
- Training: Trained on 10 trillion tokens, using knowledge distillation with LFM1-7B as the teacher model, large-scale SFT, custom DPO with length normalization, and iterative model merging.
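The four tool-use steps listed above (definition, calling, execution, interpretation) can be sketched without loading the model. This is a minimal illustration under stated assumptions: the `<tool_call>` delimiters, the `get_weather` tool, and the simulated model reply are all hypothetical, and the model's real special tokens should be taken from its chat template.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical delimiters for this sketch only; the real special tokens
# are defined by the model's chat template.
CALL_START, CALL_END = "<tool_call>", "</tool_call>"

# 1. Function definition: a JSON-style description given to the model.
TOOLS_SPEC = [{
    "name": "get_weather",
    "description": "Return the current temperature for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]


def get_weather(city: str) -> Dict[str, Any]:
    """Stand-in tool that returns canned data (no network access)."""
    return {"city": city, "temperature_c": 21}


# Maps tool names in the spec to the Python callables that implement them.
REGISTRY: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def extract_call(model_output: str) -> Dict[str, Any]:
    """2. Function call: pull the JSON payload out of the model's reply."""
    start = model_output.index(CALL_START) + len(CALL_START)
    end = model_output.index(CALL_END, start)
    return json.loads(model_output[start:end])


def execute_call(call: Dict[str, Any]) -> str:
    """3. Function execution: run the named tool and serialize its result."""
    result = REGISTRY[call["name"]](**call["arguments"])
    return json.dumps(result)


# Simulated model reply; a real reply would come from text generation.
reply = (
    'Let me check. <tool_call>{"name": "get_weather", '
    '"arguments": {"city": "Paris"}}</tool_call>'
)
call = extract_call(reply)
tool_message = {"role": "tool", "content": execute_call(call)}
# 4. Interpretation: tool_message is appended to the conversation so the
# model can turn the raw result into a natural-language answer.
```

Keeping the registry separate from the spec means the model only ever sees the JSON description, while the application decides which Python function actually runs.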
Recommended Use Cases
LFM2-350M is particularly suited for fine-tuning on narrow use cases to maximize performance. It is recommended for:
- Agentic tasks
- Data extraction
- RAG (Retrieval Augmented Generation)
- Creative writing
- Multi-turn conversations
However, because of its small size, it is not recommended for knowledge-intensive tasks or for tasks that require programming skills.
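For the RAG use case listed above, the 32,768 token context length bounds how much retrieved text fits in one prompt. The sketch below shows one simple way to stay inside that budget. It is illustrative only: `pack_chunks`, `rough_token_count`, and the 512-token answer reserve are assumptions, and the whitespace count is a crude stand-in for the model's real tokenizer.

```python
from typing import List

CONTEXT_LENGTH = 32_768  # context length stated in the model description


def rough_token_count(text: str) -> int:
    """Crude whitespace count; a real pipeline would use the model's tokenizer."""
    return len(text.split())


def pack_chunks(
    chunks: List[str],
    question: str,
    reserve_for_answer: int = 512,
) -> List[str]:
    """Keep retrieved chunks, in ranked order, until the prompt budget is spent."""
    budget = CONTEXT_LENGTH - reserve_for_answer - rough_token_count(question)
    kept: List[str] = []
    for chunk in chunks:
        cost = rough_token_count(chunk)
        if cost > budget:
            # Stop at the first chunk that does not fit, preserving rank order.
            break
        kept.append(chunk)
        budget -= cost
    return kept
```

Stopping at the first chunk that does not fit, rather than skipping it and trying smaller ones, keeps the highest-ranked passages contiguous; whether that trade-off is right depends on the retriever.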