tugot17/LFM2.5-1.2B-Instruct-yarn2x

TEXT GENERATIONConcurrency Cost:1Model Size:1.2BQuant:BF16Ctx Length:32kPublished:May 22, 2026License:lfm1.0Architecture:Transformer Cold

LFM2.5-1.2B-Instruct is a 1.17 billion parameter instruction-tuned hybrid language model developed by Liquid AI, featuring a 32,768 token context length. It is designed for on-device deployment, offering best-in-class performance for its size and extremely fast edge inference across various hardware. This model excels at agentic tasks, data extraction, and RAG, rivaling much larger models while maintaining a low memory footprint.

Loading preview...

LFM2.5-1.2B-Instruct: On-Device AI with High Performance

LFM2.5-1.2B-Instruct, developed by Liquid AI, is a 1.17 billion parameter instruction-tuned model built on the LFM2 architecture. It features an extended pre-training on 28 trillion tokens and large-scale multi-stage reinforcement learning, resulting in a 32,768 token context length. This model is specifically engineered for on-device deployment, delivering performance comparable to significantly larger models.

Key Capabilities & Features

  • Optimized for Edge Inference: Achieves 239 tok/s decode on AMD CPU and 82 tok/s on mobile NPU, running under 1GB of memory. It supports llama.cpp, MLX, and vLLM from day one.
  • Strong Performance for its Size: Benchmarks show it outperforms other sub-2B models like Qwen3-1.7B and Gemma 3 1B IT across various metrics including GPQA, MMLU-Pro, and IFEval.
  • Tool Use Support: Integrates function calling with a ChatML-like template, allowing for agentic workflows and interaction with external tools.
  • Multilingual Support: Capable in English, Arabic, Chinese, French, German, Japanese, Korean, and Spanish.

Recommended Use Cases

  • Agentic tasks: Ideal for applications requiring structured interactions and decision-making.
  • Data extraction: Efficiently extracts information from text.
  • Retrieval Augmented Generation (RAG): Suitable for enhancing generation with external knowledge bases.

Limitations

  • Not recommended for knowledge-intensive tasks or programming-specific applications.