Name: ArcadaLabs/Ouro-2.6B-Thinking-mlx-bf16 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: ArcadaLabs

ArcadaLabs/Ouro-2.6B-Thinking-mlx-bf16 Overview

This model is an unquantized (bf16) MLX conversion of ByteDance/Ouro-2.6B-Thinking, a 2.6-billion-parameter Looped Language Model (LoopLM), a Universal Transformer-style architecture. Its distinguishing feature is chain-of-thought reasoning: it generates explicit <think> tokens before formulating a final response. This process is powered by architectural recurrence, in which the model loops over its 24 physical layers so that they act as 96 effective layers.
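
The layer arithmetic above can be illustrated with a toy sketch. This is not the model's actual implementation: the "layers" here are hypothetical stand-ins that just add 1, and the loop count of 4 is an assumption inferred from 96 / 24 rather than a figure stated in the text.

```python
from typing import Callable, List

Layer = Callable[[float], float]


def effective_depth(physical_layers: int, loops: int) -> int:
    """Number of layer applications when one stack is reused `loops` times."""
    return physical_layers * loops


def looped_forward(x: float, layers: List[Layer], loops: int) -> float:
    """Apply the same stack of layers `loops` times (the weights are shared)."""
    for _ in range(loops):
        for layer in layers:
            x = layer(x)
    return x


# Toy stand-in for transformer blocks: each "layer" simply adds 1.
toy_layers: List[Layer] = [lambda h: h + 1.0 for _ in range(24)]

# Assumption: loop count inferred from 96 effective / 24 physical layers.
LOOPS = 96 // 24

print(effective_depth(24, LOOPS))              # 96
print(looped_forward(0.0, toy_layers, LOOPS))  # 96.0
```

Because the same 24 layers are revisited on every pass, the parameter count stays that of a 24-layer stack while the number of layer applications per token grows with the loop count.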

Key Capabilities & Features

Explicit Reasoning: Generates a visible chain-of-thought, allowing users to observe the model's reasoning process.
MLX Optimization: Converted for efficient inference on Apple Silicon. In informal benchmarks it reached 11.9 tok/s, compared with 5.0 tok/s for PyTorch fp16 on MPS (roughly 2.4x the throughput).
Looped Architecture: Utilizes recurrent looping over transformer blocks, enabling deeper processing with fewer physical layers.
Context Length: Trained with a 4K context length, extendable up to 64K tokens.
Unquantized bf16 Weights: This conversion keeps the weights in bfloat16 (16-bit) rather than quantizing them, offering a balance between performance and numerical stability.
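
A few of the figures above can be turned into back-of-the-envelope numbers. The sketch below is only arithmetic on the quoted values: it assumes a nominal 2.6e9 parameters and 2 bytes per bfloat16 value, and the weight estimate ignores the KV cache, activations, and any runtime overhead.

```python
def speedup(new_tok_s: float, base_tok_s: float) -> float:
    """Throughput ratio between two tokens-per-second measurements."""
    return new_tok_s / base_tok_s


def weight_gb(n_params: float, bytes_per_param: int) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    return n_params * bytes_per_param / 1e9


# Informal benchmark figures quoted above.
MLX_TOK_S = 11.9
PYTORCH_MPS_TOK_S = 5.0

# Assumptions: nominal parameter count, 2 bytes per bf16 value.
N_PARAMS = 2.6e9
BF16_BYTES = 2

print(f"speedup: {speedup(MLX_TOK_S, PYTORCH_MPS_TOK_S):.2f}x")    # speedup: 2.38x
print(f"bf16 weights: {weight_gb(N_PARAMS, BF16_BYTES):.1f} GB")  # bf16 weights: 5.2 GB
```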

When to Use This Model

Reasoning-Intensive Tasks: Ideal for applications requiring transparent, step-by-step reasoning, such as complex problem-solving or analysis.
Apple Silicon Deployment: A good fit for developers targeting Apple Silicon devices, thanks to its MLX optimization.
Understanding Model Thought Process: Useful for research or debugging where insight into the model's internal deliberation is beneficial.
Resource-Efficient Deep Processing: Its looped architecture allows for deep processing with a relatively small parameter count, making it efficient for certain tasks.
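
For use cases that inspect the reasoning separately from the final response, the raw output can be split after generation. The sketch below is a minimal example under one loud assumption: that the reasoning is wrapped in a `<think>` ... `</think>` pair. The text above only mentions `<think>` tokens, so the closing tag, the helper name `split_reasoning`, and the sample string are all hypothetical.

```python
import re
from typing import Tuple

# Assumption: reasoning is delimited by <think> ... </think>.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> Tuple[str, str]:
    """Return (reasoning, answer) extracted from raw model output."""
    reasoning = "\n".join(block.strip() for block in THINK_RE.findall(text))
    answer = THINK_RE.sub("", text).strip()
    return reasoning, answer


# Hypothetical output string, used only to exercise the helper.
sample = "<think>2 + 2 is 4, doubled is 8.</think>The result is 8."
reasoning, answer = split_reasoning(sample)
print(reasoning)  # 2 + 2 is 4, doubled is 8.
print(answer)     # The result is 8.
```

If generation is cut off before the closing tag, the pattern does not match and the whole text is returned as the answer, so callers that need strict separation should check for an unclosed `<think>` themselves.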
