Mira-v1.20-27B-dpo: A DPO-Tuned Creative Language Model
Mira-v1.20-27B-dpo is a 27-billion-parameter model from Lambent, distinguished by its fine-tuning approach. Like its predecessor v1.19.3c, this version applies Direct Preference Optimization (DPO) to the model's own generated data, with performance evaluated against an LLM-judged benchmark, EQ-Bench3. Despite a small training batch size, the model learned effectively and improved on that benchmark.
Key Capabilities
- Advanced Poetic Generation: The model demonstrates a remarkable ability to generate high-quality, evocative poetry, as evidenced by multiple samples provided in the README. It produces coherent, imaginative verse both without a system prompt and when given persona-based instructions.
- DPO Fine-Tuning: Its development leverages DPO, a method for aligning language models with human preferences, applied here to the model's own generated outputs. This iterative refinement process aims to improve the quality and alignment of the model's writing.
- 32K Context Length: A 32K-token context window lets Mira-v1.20-27B-dpo handle longer inputs and maintain coherence over extended creative-writing tasks.
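For readers unfamiliar with DPO, the core idea can be sketched in a few lines. This is a minimal, illustrative implementation of the per-pair DPO objective, not Lambent's actual training code: the function name, argument names, and `beta` value are assumptions for the sketch. Given log-probabilities of a preferred ("chosen") and dispreferred ("rejected") completion under both the trained policy and a frozen reference model, the loss pushes the policy to widen its preference margin while `beta` limits drift from the reference.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for a single preference pair.

    Each argument is the total log-probability of the chosen or
    rejected completion under the trained policy or the frozen
    reference model; beta controls drift from the reference.
    """
    # Implicit reward margin: how much more strongly the policy
    # prefers the chosen completion, relative to the reference.
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # Loss is -log(sigmoid(margin)), computed as a stable softplus.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))

# A pair the policy already ranks correctly yields a loss below ln 2;
# minimizing this loss over many pairs is one DPO update direction.
print(dpo_loss(-10.0, -12.0, -11.0, -11.0))
```

In the self-generated setup described above, both completions in each pair would come from the model itself, with the LLM judge (rather than human annotators) deciding which one counts as "chosen".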
Good For
- Creative Writing: Excels in generating poems, descriptive passages, and other forms of imaginative text.
- Content Generation: Suitable for applications requiring nuanced and expressive language.
- Exploration of DPO-Tuned Models: Offers insights into the capabilities of models fine-tuned with DPO on self-generated data.