Name: Lambent/Mira-v1.25.2-27B-DPO API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: Lambent

Overview

Lambent/Mira-v1.25.2-27B-DPO is a 27 billion parameter language model, representing an iteration built upon the Mira-v1.25.1-27B-DPO base. It was created using a Karcher Mean merge method, combining the base model with several DPO-adapted versions.

Key Characteristics

Second DPO Phase: This version incorporates a second Direct Preference Optimization (DPO) phase. This DPO specifically targets synthetic negative examples that reflect the model's value drift patterns during multi-turn self-examination, aiming to refine its behavior in complex interactions.
Creative Capabilities: The model is noted to retain creative capabilities and intelligence comparable to its immediate predecessor, Mira-v1.25.1-27B-DPO.
Merge Method: The model is a merge of pre-trained language models, utilizing the Karcher Mean method for integration.

Good For

Applications requiring nuanced language generation and understanding.
Use cases where refined multi-turn interaction and self-correction are beneficial.
Tasks that leverage the creative and intelligent capabilities established in the Mira-v1.25.1-27B-DPO series.

Overview

Overview

Key Characteristics

Good For

Full Model Card (README)