The jwkirchenbauer/L3-1-8B-Magpie-MTP is an 8-billion-parameter language model with a 32,768-token context length, developed by jwkirchenbauer. It is trained with a Multi-Token Prediction (MTP) objective, allowing it to predict multiple future tokens in a single forward pass, and ships with a custom generation API designed for accelerated decoding, making it particularly efficient when inference speed is critical.
Overview
The jwkirchenbauer/L3-1-8B-Magpie-MTP is an 8-billion-parameter language model trained with a Multi-Token Prediction (MTP) objective. Unlike standard autoregressive models that generate one token at a time, this model can predict the next k tokens in a single forward pass, significantly accelerating inference.
Key Capabilities
- Accelerated Inference: Utilizes a custom `generate()` implementation to predict `k` tokens simultaneously, bypassing the need for auxiliary draft models.
- Adaptive Decoding: Features an adaptive mode (ConfAdapt) that dynamically adjusts the number of predicted tokens based on the model's confidence, balancing speed and accuracy.
- Custom Generation API: Requires `trust_remote_code=True` to enable its specialized generation logic, offering flexible control over decoding strategies.
- Configurable Strategies: Supports fixed-K generation for consistent acceleration and adaptive strategies such as `conf_adapt` for nearly lossless, variable acceleration.
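The difference between fixed-K and confidence-adaptive decoding can be illustrated with a small standalone sketch. This is not the model's actual implementation; the function name and threshold are illustrative. Given per-position confidences from one MTP forward pass, a ConfAdapt-style rule keeps the longest prefix of predicted tokens whose confidence stays above a threshold, while fixed-K mode always keeps exactly K:

```python
def accept_mtp_tokens(confidences, threshold=0.9, fixed_k=None):
    """Decide how many of the k tokens predicted in one forward pass to keep.

    confidences: per-token confidence scores (e.g. softmax probabilities)
                 for the k speculated positions, in order.
    fixed_k:     if set, emulate fixed-K mode and accept exactly
                 min(fixed_k, k) tokens; otherwise use the adaptive rule.
    """
    if fixed_k is not None:
        return min(fixed_k, len(confidences))
    accepted = 0
    for c in confidences:
        if c < threshold:
            break  # stop at the first low-confidence position
        accepted += 1
    # Always keep at least one token so decoding makes progress.
    return max(accepted, 1)
```

The adaptive rule trades a variable amount of acceleration for accuracy: on easy continuations it accepts many tokens per pass, and on hard ones it falls back toward one token per pass, which is why the acceleration is nearly lossless.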
Usage Notes
To leverage MTP, users must pass `do_mtp=True` to the `generate()` function and specify the correct `mask_id` and `eos_id` for the model. MTP generation currently supports single-example generation only; batching is not supported.
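A minimal usage sketch follows. The `do_mtp`, `mask_id`, and `eos_id` arguments are named in this card, but their exact values and the full `generate()` signature come from the model's custom remote code, so treat the specifics below (including the placeholder mask-token lookup) as assumptions to verify against the repository:

```python
def mtp_generate(prompt: str,
                 model_id: str = "jwkirchenbauer/L3-1-8B-Magpie-MTP") -> str:
    """Sketch of single-example MTP generation (no batching support)."""
    # Imports kept local so the sketch can be read without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # trust_remote_code=True is required for the custom generation logic.
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

    inputs = tokenizer(prompt, return_tensors="pt")  # single example only
    output = model.generate(
        **inputs,
        do_mtp=True,                        # enable multi-token prediction
        mask_id=tokenizer.mask_token_id,    # placeholder: check the repo for the real ID
        eos_id=tokenizer.eos_token_id,
        max_new_tokens=128,
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Because the generation path is custom, keyword names beyond those documented here (for example, a flag selecting `conf_adapt` versus fixed-K) should be taken from the model repository's own code rather than assumed.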