LiquidAI/LFM2-350M

Text Generation | Concurrency Cost: 1 | Model Size: 0.35B | Quant: BF16 | Ctx Length: 32k | Published: Jul 10, 2025 | License: lfm1.0 | Architecture: Transformer | 0.2K | Cold

LFM2-350M is a 0.35 billion parameter hybrid model developed by Liquid AI, built on a novel architecture with multiplicative gates and short convolutions. Designed for edge AI and on-device deployment, it trains 3x faster than Liquid AI's previous model generation and runs decode and prefill 2x faster on CPU than Qwen3. It outperforms similarly sized models on benchmarks covering knowledge, mathematics, instruction following, and multilingual capabilities.


LFM2-350M: A Hybrid Model for Edge AI

LFM2-350M is a 0.35 billion parameter model from Liquid AI's new generation of hybrid models, specifically engineered for edge AI and on-device deployment. It introduces a novel architecture combining multiplicative gates and short convolutions, resulting in significant performance gains.
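To give a rough sense of why a model of this size suits on-device deployment, the sketch below estimates the memory needed for the weights alone from the listed parameter count (0.35B) and precision (BF16, 2 bytes per parameter). The `int8` and `int4` rows are hypothetical alternatives included only for comparison; they are not formats this listing says are available. Activations, caches, and runtime overhead are not counted.

```python
# Rough weight-memory estimate; activations, caches, and runtime overhead are excluded.
PARAMS = 0.35e9  # parameter count taken from the model listing

# Bytes per parameter. bf16 matches the listing; int8 and int4 are
# hypothetical alternatives shown only for comparison.
BYTES_PER_PARAM = {"bf16": 2, "int8": 1, "int4": 0.5}


def weight_memory_gb(params, dtype):
    """Return the approximate weight size in decimal gigabytes."""
    return params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gb(PARAMS, dtype):.2f} GB")
```

Under these assumptions the BF16 weights come to roughly 0.7 GB, which is within reach of a typical smartphone or laptop.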

Key Capabilities & Performance

  • Optimized Speed: Trains 3x faster than Liquid AI's previous model generation and delivers 2x faster decode and prefill on CPU than Qwen3, making it efficient in resource-constrained environments.
  • Superior Performance: Outperforms other models of similar size across multiple benchmarks, including knowledge, mathematics, instruction following, and multilingual tasks.
  • Flexible Deployment: Designed to run efficiently on CPU, GPU, and NPU hardware, enabling deployment on diverse devices like smartphones, laptops, and vehicles.
  • Multilingual Support: Supports English, Arabic, Chinese, French, German, Japanese, Korean, and Spanish.
  • Tool Use: Supports tool use for agentic tasks. Tools are described to the model as JSON function definitions, and the model responds with Pythonic function calls.
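As a rough illustration of the tool-use item above: the application describes a tool as a JSON function definition and then has to parse the Python-style call the model writes back before executing it. The sketch below is one possible way to do that parsing safely with the standard library. The `get_weather` tool, the `REGISTRY` mapping, and the example reply string are all hypothetical, and the special tokens and chat template that wrap these pieces in a real prompt are not shown; take those from the model's own documentation.

```python
import ast
import json

# Hypothetical tool definition in JSON-schema style (illustrative only).
TOOLS = [{
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]


def parse_pythonic_calls(text):
    """Parse a string such as '[f(a=1), g(b="x")]' into (name, kwargs) pairs.

    The string is parsed with ast and never executed; only literal
    keyword arguments are accepted.
    """
    tree = ast.parse(text.strip(), mode="eval")
    nodes = tree.body.elts if isinstance(tree.body, ast.List) else [tree.body]
    calls = []
    for node in nodes:
        if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
            raise ValueError("expected a plain function call")
        if node.args:
            raise ValueError("positional arguments are not supported in this sketch")
        kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
        calls.append((node.func.id, kwargs))
    return calls


def get_weather(city):
    # Stand-in implementation; a real tool would query a weather service.
    return {"city": city, "forecast": "sunny"}


# Only functions listed here can be invoked by a parsed call.
REGISTRY = {"get_weather": get_weather}

tools_json = json.dumps(TOOLS)  # JSON text that would be placed in the prompt
model_output = '[get_weather(city="Paris")]'  # assumed example of a model reply
results = [REGISTRY[name](**kwargs)
           for name, kwargs in parse_pythonic_calls(model_output)]
print(results)
```

Parsing with `ast` instead of `eval` means the model's output is never run as code, and the explicit registry limits calls to functions the application has chosen to expose.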

Recommended Use Cases

LFM2-350M is particularly suited for fine-tuning on narrow use cases to maximize performance. It is recommended for:

  • Agentic tasks
  • Data extraction
  • RAG (Retrieval Augmented Generation)
  • Creative writing
  • Multi-turn conversations

Because of its small size, it is not recommended for knowledge-intensive tasks or for tasks that require strong programming skills unless it has been fine-tuned for them.