Name: ByteDance/Ouro-1.4B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: ByteDance

Ouro-1.4B: An Efficient Looped Language Model

Ouro-1.4B, developed by ByteDance, is a 1.4 billion parameter Looped Language Model (LoopLM). It is designed to match the performance of standard transformers with 3-4 billion parameters by applying the same weights iteratively, so that additional computation happens as recurrent processing in latent space rather than through additional parameters.

Key Capabilities

Exceptional Parameter Efficiency: Delivers performance comparable to 3-4B parameter models with only 1.4B parameters.
Iterative Latent Reasoning: Performs reasoning through recurrent computation in latent space, reusing the same weights at each step.
Adaptive Computation: Offers a configurable number of recurrent steps (total_ut_steps) and an adaptive early exit mechanism (early_exit_threshold), so compute can be matched to task complexity. Note that vLLM currently bypasses the adaptive exit feature and always executes the full number of recurrent steps.
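
The interaction between total_ut_steps and early_exit_threshold can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not Ouro's actual implementation: the function looped_forward, the scalar state, and the rule "stop once accumulated exit probability reaches the threshold" are hypothetical stand-ins chosen for illustration. Only the two parameter names come from the description above.

```python
from typing import Callable, Tuple


def looped_forward(
    state: float,
    shared_block: Callable[[float], float],
    exit_prob: Callable[[float, int], float],
    total_ut_steps: int = 4,
    early_exit_threshold: float = 1.0,
) -> Tuple[float, int]:
    """Apply one shared block repeatedly, with an optional early exit.

    Hypothetical illustration: the same function (shared weights) is applied
    up to total_ut_steps times, and the loop stops as soon as the accumulated
    exit probability reaches early_exit_threshold.
    """
    cumulative = 0.0
    steps_used = 0
    for step in range(total_ut_steps):
        state = shared_block(state)           # same "weights" at every step
        steps_used = step + 1
        cumulative += exit_prob(state, step)  # assumed per-step exit signal
        if cumulative >= early_exit_threshold:
            break
    return state, steps_used


def toy_block(x: float) -> float:
    # Stand-in for the shared transformer stack.
    return 0.5 * x + 1.0


def toy_exit_prob(state: float, step: int) -> float:
    # Stand-in exit gate that emits a constant probability per step.
    return 0.4


if __name__ == "__main__":
    # A low threshold exits early; a threshold that is never reached
    # runs all total_ut_steps iterations.
    print(looped_forward(0.0, toy_block, toy_exit_prob, 4, 0.7))
    print(looped_forward(0.0, toy_block, toy_exit_prob, 4, 2.0))
```

In this sketch, lowering the threshold trades refinement steps for less computation, while a threshold that is never reached behaves like the always-full-steps case described for vLLM.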

Model Architecture & Training

Ouro-1.4B is a decoder-only Transformer with 24 layers, 4 recurrent steps, and a hidden size of 2048. It uses Multi-Head Attention, SwiGLU activation, RoPE position embeddings, and Sandwich RMSNorm. The model was trained on 7.7 trillion tokens over multiple stages (pre-training, CT Annealing, long context training, and mid-training) using a diverse dataset of web data, code, mathematics, and long-context documents.
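
To make the depth arithmetic concrete: if each of the 4 recurrent steps runs all 24 layers, a token passes through 96 layer applications while the parameter count remains that of a 24-layer stack. The sketch below only encodes that arithmetic. The class name LoopedConfig and its fields are hypothetical and not part of any Ouro or transformers API, and the assumption that every step traverses the full stack is ours.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopedConfig:
    """Hypothetical container for the figures quoted above."""

    num_layers: int = 24       # distinct layers holding parameters
    recurrent_steps: int = 4   # times the shared stack is applied
    hidden_size: int = 2048

    def layer_applications(self) -> int:
        # Assumes every recurrent step runs the whole layer stack.
        return self.num_layers * self.recurrent_steps


if __name__ == "__main__":
    cfg = LoopedConfig()
    print(cfg.layer_applications())  # 24 * 4 = 96
```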

Good for

Research into parameter-efficient language models and recurrent computation.
Applications where computational budget is a constraint but performance comparable to larger models is desired.
Experimenting with adaptive computation and early exit strategies in LLMs.
