LFM2.5-1.2B-Instruct: On-Device AI with High Performance

LFM2.5-1.2B-Instruct, developed by Liquid AI, is a 1.17 billion parameter instruction-tuned model built on the LFM2 architecture. It features an extended pre-training on 28 trillion tokens and large-scale multi-stage reinforcement learning, resulting in a 32,768 token context length. This model is specifically engineered for on-device deployment, delivering performance comparable to significantly larger models.

Key Capabilities & Features

Optimized for Edge Inference: Achieves 239 tok/s decode on AMD CPU and 82 tok/s on mobile NPU, running under 1GB of memory. It supports llama.cpp, MLX, and vLLM from day one.
Strong Performance for its Size: Benchmarks show it outperforms other sub-2B models like Qwen3-1.7B and Gemma 3 1B IT across various metrics including GPQA, MMLU-Pro, and IFEval.
Tool Use Support: Integrates function calling with a ChatML-like template, allowing for agentic workflows and interaction with external tools.
Multilingual Support: Capable in English, Arabic, Chinese, French, German, Japanese, Korean, and Spanish.

Recommended Use Cases

Agentic tasks: Ideal for applications requiring structured interactions and decision-making.
Data extraction: Efficiently extracts information from text.
Retrieval Augmented Generation (RAG): Suitable for enhancing generation with external knowledge bases.

Limitations

Not recommended for knowledge-intensive tasks or programming-specific applications.

Overview

LFM2.5-1.2B-Instruct: On-Device AI with High Performance

Key Capabilities & Features

Recommended Use Cases

Limitations

Full Model Card (README)