Name: KristianS7/Ouro-1.4B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: KristianS7

Ouro-1.4B: A Parameter-Efficient Looped Language Model

KristianS7/Ouro-1.4B is a 1.4-billion-parameter Looped Language Model (LoopLM) developed by ByteDance and intended for research use. Its distinguishing feature is parameter efficiency: by applying the same shared weights iteratively, it is reported to match the performance of standard transformers with 3-4 billion parameters.
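The idea of iterative shared-weight computation can be illustrated with a toy sketch. This is not Ouro's implementation: `looped_forward`, `toy_block`, and the scalar `weight` and `bias` are hypothetical stand-ins for a real layer stack, chosen only to show that extra loop steps add computation without adding parameters.

```python
def looped_forward(x, shared_block, steps):
    """Apply the same block `steps` times; every step reuses the same weights."""
    for _ in range(steps):
        x = shared_block(x)
    return x


# Hypothetical toy block: a single scalar weight and bias stand in for a layer stack.
weight, bias = 0.5, 1.0


def toy_block(x):
    return weight * x + bias


# The parameter count (weight, bias) is identical in both calls;
# only the number of times the block is applied changes.
out_1 = looped_forward(4.0, toy_block, steps=1)
out_3 = looped_forward(4.0, toy_block, steps=3)
print(out_1, out_3)
```

In a standard transformer, each additional layer of depth brings its own weights; in a looped design like the one sketched here, depth comes from repetition instead.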

Key Capabilities & Features

Iterative Latent Reasoning: Performs reasoning through recurrent computation within its latent space.
Adaptive Computation: Supports early-exit mechanisms, so the amount of computation can vary dynamically with the difficulty of the task.
Configurable Recurrent Steps: Users can set total_ut_steps to trade performance against computation time, and early_exit_threshold to control when adaptive exit triggers.
Robust Architecture: Based on a decoder-only Transformer with 24 layers, 2048 hidden size, Multi-Head Attention, SwiGLU FFN, RoPE, and Sandwich RMSNorm.
Extensive Training: Trained on 7.7 trillion tokens, including web data, code, mathematics, and long-context documents, with a context length extendable to 64K tokens.
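To make the interaction between total_ut_steps and early_exit_threshold concrete, here is a minimal sketch of one plausible early-exit rule: stop at the first recurrent step where the cumulative exit probability reaches the threshold, and otherwise run all steps. The rule itself, the helper `steps_used`, and the per-step probabilities are assumptions for illustration; the listing does not specify how the model computes its exit decision.

```python
def steps_used(exit_probs, total_ut_steps, early_exit_threshold):
    """Return how many recurrent steps run under a cumulative-probability exit rule.

    exit_probs[i] is an assumed probability of exiting at step i + 1,
    given that the computation has not exited earlier.
    """
    still_running = 1.0
    for step, p in enumerate(exit_probs[:total_ut_steps], start=1):
        still_running *= 1.0 - p
        # Cumulative probability of having exited by this step.
        if 1.0 - still_running >= early_exit_threshold:
            return step
    # Threshold never reached: use the full step budget.
    return total_ut_steps


# Illustrative values only; not taken from the model's defaults.
easy_input = [0.95, 0.5, 0.5, 0.5]
hard_input = [0.1, 0.2, 0.3, 0.4]

easy_steps = steps_used(easy_input, total_ut_steps=4, early_exit_threshold=0.9)
hard_steps = steps_used(hard_input, total_ut_steps=4, early_exit_threshold=0.9)
print(easy_steps, hard_steps)
```

Under this rule, a lower threshold exits sooner and saves computation, while a threshold of 1.0 effectively disables early exit and always spends the full total_ut_steps budget.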

When to Use This Model

Research on Parameter Efficiency: Ideal for exploring methods to achieve high performance with fewer parameters.
Adaptive Computation Studies: Suitable for investigating dynamic compute allocation and early exit strategies in LLMs.
Resource-Constrained Environments: Potentially useful for applications where computational resources are limited, given its parameter efficiency.

Note: This model is intended for research and is provided as-is. The adaptive exit feature is not currently supported by vLLM.
