LFM2.5-1.2B-Instruct: On-Device AI

LFM2.5-1.2B-Instruct, developed by Liquid AI, is a 1.2-billion-parameter instruction-tuned model optimized for on-device deployment and fast edge inference. It builds on the LFM2 architecture, extending pre-training to 28 trillion tokens and adding multi-stage reinforcement learning, to reach performance comparable to much larger models.

Key Capabilities & Features

Efficient On-Device Performance: Decodes at 239 tok/s on an AMD CPU and 82 tok/s on a mobile NPU while using less than 1 GB of memory. It has supported llama.cpp, MLX, and vLLM since release.
Extended Context Window: Supports a context length of 32,768 tokens.
Multilingual Support: Trained on English, Arabic, Chinese, French, German, Japanese, Korean, and Spanish.
Tool Use: Supports function calling with a Pythonic format, enabling integration with external tools for complex tasks.
Hybrid Architecture: Uses 10 double-gated LIV convolution blocks and 6 grouped-query attention (GQA) blocks.
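
As an illustration of the tool-use feature, here is a minimal sketch of how an application might parse and dispatch a Pythonic function call. It assumes the model emits calls as a Python-style list such as `[get_weather(city="Paris")]`; that string, the `get_weather` tool, and the helper names `parse_pythonic_calls` and `dispatch` are hypothetical and not taken from the model card. The sketch uses Python's `ast` module so that the model output is never executed directly.

```python
import ast

def parse_pythonic_calls(text):
    """Parse '[f(a=1), g(b="x")]' or 'f(a=1)' into a list of (name, kwargs) pairs."""
    body = ast.parse(text.strip(), mode="eval").body
    nodes = body.elts if isinstance(body, ast.List) else [body]
    calls = []
    for node in nodes:
        # Accept only plain calls such as name(key=value, ...).
        if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
            raise ValueError("expected a plain function call")
        if node.args or any(kw.arg is None for kw in node.keywords):
            raise ValueError("only keyword arguments are supported in this sketch")
        # literal_eval accepts literals only, so no arbitrary code runs here.
        kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
        calls.append((node.func.id, kwargs))
    return calls

def dispatch(text, tools):
    """Look up each parsed call in a registry of allowed tools and run it."""
    return [tools[name](**kwargs) for name, kwargs in parse_pythonic_calls(text)]

# Hypothetical tool registry and hypothetical model output.
TOOLS = {"get_weather": lambda city: f"Sunny in {city}"}
model_output = '[get_weather(city="Paris")]'
results = dispatch(model_output, TOOLS)
print(results)
```

Restricting dispatch to a fixed registry means a call to an unknown function name raises `KeyError` instead of running anything.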

Performance & Benchmarks

LFM2.5-1.2B-Instruct performs strongly against other sub-2B models on benchmarks including GPQA, MMLU-Pro, IFEval, and AIME25, often outperforming competitors in its size class. Its inference speed on CPUs and NPUs is particularly notable, opening up deployment in vehicles, mobile devices, and IoT hardware.
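
To put the quoted throughput and memory figures in perspective, the following back-of-the-envelope sketch estimates weight storage at a few bit widths and the time to decode a response. It rests on assumptions not stated in the source: the parameter count is rounded to 1.2e9, memory covers weights only (no KV cache or activations), and decode time ignores prompt processing. The 256-token response length is an arbitrary illustrative choice.

```python
PARAMS = 1.2e9  # assumption: parameter count rounded from the model name

def weight_gb(bits_per_weight, params=PARAMS):
    """Approximate weight storage in decimal gigabytes (weights only)."""
    return params * bits_per_weight / 8 / 1e9

def decode_seconds(n_tokens, tok_per_s):
    """Time to generate n_tokens at a steady decode rate."""
    return n_tokens / tok_per_s

for bits in (16, 8, 4):
    print(f"{bits}-bit weights: ~{weight_gb(bits):.1f} GB")

# Decode rates are the ones quoted in the feature list (tok/s).
for label, rate in (("AMD CPU", 239), ("mobile NPU", 82)):
    print(f"{label}: 256 tokens in ~{decode_seconds(256, rate):.1f} s")
```

Under these assumptions, 16-bit weights alone would need about 2.4 GB, so a footprint below 1 GB suggests a quantized format. That is an inference from the arithmetic, not a statement from the model card.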

Recommended Use Cases

This model is particularly well-suited for:

Agentic tasks
Data extraction
Retrieval-Augmented Generation (RAG)

It is not recommended for knowledge-intensive tasks or programming.
