tugot17/LFM2.5-1.2B-Instruct-yarn2x
LFM2.5-1.2B-Instruct is a 1.17 billion parameter instruction-tuned hybrid language model developed by Liquid AI, featuring a 32,768 token context length. It is designed for on-device deployment, offering best-in-class performance for its size and extremely fast edge inference across various hardware. This model excels at agentic tasks, data extraction, and RAG, rivaling much larger models while maintaining a low memory footprint.
Loading preview...
LFM2.5-1.2B-Instruct: On-Device AI with High Performance
LFM2.5-1.2B-Instruct, developed by Liquid AI, is a 1.17 billion parameter instruction-tuned model built on the LFM2 architecture. It features an extended pre-training on 28 trillion tokens and large-scale multi-stage reinforcement learning, resulting in a 32,768 token context length. This model is specifically engineered for on-device deployment, delivering performance comparable to significantly larger models.
Key Capabilities & Features
- Optimized for Edge Inference: Achieves 239 tok/s decode on AMD CPU and 82 tok/s on mobile NPU, running under 1GB of memory. It supports
llama.cpp, MLX, and vLLM from day one. - Strong Performance for its Size: Benchmarks show it outperforms other sub-2B models like Qwen3-1.7B and Gemma 3 1B IT across various metrics including GPQA, MMLU-Pro, and IFEval.
- Tool Use Support: Integrates function calling with a ChatML-like template, allowing for agentic workflows and interaction with external tools.
- Multilingual Support: Capable in English, Arabic, Chinese, French, German, Japanese, Korean, and Spanish.
Recommended Use Cases
- Agentic tasks: Ideal for applications requiring structured interactions and decision-making.
- Data extraction: Efficiently extracts information from text.
- Retrieval Augmented Generation (RAG): Suitable for enhancing generation with external knowledge bases.
Limitations
- Not recommended for knowledge-intensive tasks or programming-specific applications.