LiquidAI/LFM2.5-1.2B-Instruct

Text Generation | Concurrency Cost: 1 | Model Size: 1.2B | Quant: BF16 | Ctx Length: 32k | Published: Jan 6, 2026 | License: lfm1.0 | Architecture: Transformer

LiquidAI/LFM2.5-1.2B-Instruct is a 1.2 billion parameter instruction-tuned hybrid model from Liquid AI, designed for efficient on-device deployment. It has a 32,768-token context length and was trained on 28 trillion tokens, followed by large-scale multi-stage reinforcement learning. Liquid AI positions it as best-in-class for its size, competitive with larger models, with fast edge inference across a range of devices. It is suited to agentic tasks and data extraction.


LFM2.5-1.2B-Instruct: On-Device AI Powerhouse

LFM2.5-1.2B-Instruct is a 1.2 billion parameter instruction-tuned model developed by Liquid AI, part of the LFM2.5 family of hybrid models. It is specifically engineered for on-device deployment, offering high performance in a compact footprint. The model builds on the LFM2 architecture with significantly extended pre-training (28 trillion tokens) and advanced multi-stage reinforcement learning.
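To put the "compact footprint" in numbers, here is a rough, weights-only estimate of memory use at a few storage precisions. The parameter count and the BF16 listing come from this page; the 8-bit and 4-bit rows are illustrative assumptions, not published figures, and the estimate ignores the KV cache, activations, and runtime overhead.

```python
# Weights-only memory estimate. This is a back-of-the-envelope sketch:
# it ignores KV cache, activations, and runtime overhead.
PARAMS = 1.2e9  # parameter count stated on this page

# Bytes per parameter for a few storage precisions.
# "int8" and "int4" are illustrative assumptions, not published formats.
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_gb(params: float, precision: str) -> float:
    """Approximate weight size in gigabytes (1 GB = 1e9 bytes)."""
    return params * BYTES_PER_PARAM[precision] / 1e9


for name in BYTES_PER_PARAM:
    print(f"{name}: {weight_gb(PARAMS, name):.2f} GB")
# bf16: 2.40 GB
# int8: 1.20 GB
# int4: 0.60 GB
```

On this estimate, BF16 weights alone come to about 2.4 GB, so the under-1 GB figure quoted for edge inference presumably refers to a quantized build rather than the BF16 weights.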

Key Capabilities & Features

  • Best-in-class performance for its size: Rivals much larger models, enabling high-quality AI on resource-constrained devices.
  • Fast edge inference: Achieves 239 tok/s decode on an AMD CPU and 82 tok/s on a mobile NPU, while running in under 1 GB of memory. Supports llama.cpp, MLX, and vLLM from day one.
  • Long context window: Features a 32,768-token context length.
  • Multilingual support: Handles English, Arabic, Chinese, French, German, Japanese, Korean, and Spanish.
  • Tool Use: Supports function calling with a flexible template for integrating external tools, outputting Pythonic or JSON function calls.
  • Optimized formats: Available in native, GGUF, ONNX, and MLX formats for diverse deployment scenarios, including Apple Silicon.
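The Tool Use item says the model can emit Pythonic function calls. As a minimal sketch of how an application might consume such output, the snippet below parses a single call string into a function name and keyword arguments without executing it. The function name `get_weather`, its arguments, and the helper `parse_pythonic_call` are hypothetical; the exact wrapper tokens and template the model uses are not given here, so the sketch assumes the bare call text has already been extracted.

```python
import ast


def parse_pythonic_call(text: str) -> tuple[str, dict]:
    """Parse one call such as 'f(a=1, b="x")' into (name, kwargs).

    Hypothetical helper: assumes keyword-only arguments with literal values.
    """
    node = ast.parse(text.strip(), mode="eval").body
    if not isinstance(node, ast.Call) or not isinstance(node.func, ast.Name):
        raise ValueError("expected a simple function call")
    if node.args:
        raise ValueError("positional arguments are not handled in this sketch")
    kwargs = {}
    for kw in node.keywords:
        if kw.arg is None:  # rejects **kwargs expansion
            raise ValueError("keyword expansion is not handled in this sketch")
        # literal_eval accepts only literals, so model output is never executed
        kwargs[kw.arg] = ast.literal_eval(kw.value)
    return node.func.id, kwargs


# Hypothetical model output for a weather tool
name, kwargs = parse_pythonic_call('get_weather(city="Paris", unit="celsius")')
print(name, kwargs)  # get_weather {'city': 'Paris', 'unit': 'celsius'}
```

Parsing with `ast` rather than `eval` keeps the model's output as data: the application decides which registered tool, if any, to invoke with the extracted arguments.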

Ideal Use Cases

  • Agentic tasks
  • Data extraction
  • Retrieval Augmented Generation (RAG)
  • On-device applications across mobile, IoT, vehicles, and embedded systems.
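For the RAG use case, retrieved passages have to fit inside the 32,768-token context window along with the question and the generated answer. The sketch below shows one simple way to do that: keep passages in ranked order until a token budget is exhausted. The helper names, the reserved-token default, and the whitespace-based token counter are all assumptions for illustration; a real application would count tokens with the model's own tokenizer.

```python
from typing import Callable, List

CONTEXT_TOKENS = 32_768  # context length stated on this page


def pack_passages(
    passages: List[str],
    count_tokens: Callable[[str], int],
    reserved: int = 2_048,  # assumed headroom for the question and the answer
) -> List[str]:
    """Keep passages, in ranked order, until the token budget is used up."""
    budget = CONTEXT_TOKENS - reserved
    kept: List[str] = []
    used = 0
    for passage in passages:
        n = count_tokens(passage)
        if used + n > budget:
            break  # stop at the first passage that does not fit
        kept.append(passage)
        used += n
    return kept


def approx_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


docs = ["first retrieved passage", "second retrieved passage"]
print(pack_passages(docs, approx_tokens))
```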

This model is not recommended for knowledge-intensive tasks or programming.