Name: LiquidAI/LFM2.5-350M API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: LiquidAI

LFM2.5-350M: On-Device Hybrid Language Model

LFM2.5-350M is a 350 million parameter hybrid model developed by Liquid AI, specifically engineered for on-device deployment and efficient edge inference. It extends the LFM2 architecture through significant pre-training (28T tokens) and large-scale multi-stage reinforcement learning, enabling it to deliver performance comparable to much larger models while operating under 1GB of memory.

Key Capabilities & Features

Optimized for Edge: Achieves fast decode speeds (e.g., 313 tok/s on AMD CPU, 188 tok/s on Snapdragon Gen4) with day-one support for llama.cpp, MLX, and vLLM.
Compact yet Powerful: A 350M parameter model with a 32,768 token context length, offering strong performance for its size.
Multilingual Support: Supports English, Arabic, Chinese, French, German, Japanese, Korean, Portuguese, and Spanish.
Tool Use & Function Calling: Features robust support for function calling, allowing the model to interact with external tools and interpret their outcomes.
Broad Inference Support: Available in multiple formats including native, GGUF, ONNX, MLX, and OpenVINO for diverse hardware and deployment scenarios.

Good For

Data Extraction: Efficiently extracting specific information from text.
Structured Outputs: Generating responses in predefined formats.
Tool Use: Applications requiring function calling and interaction with external systems.
On-Device & Edge Deployment: Ideal for scenarios where resources are limited, such as mobile or embedded systems, due to its small footprint and optimized inference.

Overview

LFM2.5-350M: On-Device Hybrid Language Model

Key Capabilities & Features

Good For

Full Model Card (README)