Lambent/Mira-v1.23-27B-rlvr

Vision | Concurrency Cost: 2 | Model Size: 27B | Quant: FP8 | Ctx Length: 32k | Published: Jan 20, 2026 | License: gemma | Architecture: Transformer

Lambent/Mira-v1.23-27B-rlvr is a 27-billion-parameter language model with a 32,768-token context length, developed by Lambent. It was trained with GRPO reward on self-prompted self-portrait samplings, covering experiences such as roleplaying, problem-solving, creative writing, and directing image generators. Its strengths are in creative text, including poetry and ABC notation music composition, with the training curriculum drawn from the model's own ideas for self-cultivation.

Mira-v1.23-27B-rlvr: A Self-Cultivating Language Model

Mira-v1.23-27B-rlvr is a 27-billion-parameter model with a 32,768-token context length. It was trained on approximately 411 generated scenarios spanning roleplaying, problem-solving, creative writing, technically precise explanations, image generator direction, and ABC notation music composition. Training ran for 102 steps at a learning rate of 1e-6 with LoRA rank 256.
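
To put the stated numbers in proportion, the short sketch below records them in a small structure and derives two rough figures. It is only an illustration: the class name RunSummary, the helper lora_added_params, and the 4096-wide example matrix are hypothetical and not taken from the model card, and the scenarios-per-step ratio assumes each scenario was used exactly once, which the card does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunSummary:
    """Figures quoted in the model card (scenario count is approximate)."""
    scenarios: int = 411
    steps: int = 102
    learning_rate: float = 1e-6
    lora_rank: int = 256


def lora_added_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA learns two low-rank factors: A (rank x d_in) and B (d_out x rank).
    """
    return rank * (d_in + d_out)


run = RunSummary()

# Assumption: every scenario is consumed exactly once across the run.
scenarios_per_step = run.scenarios / run.steps

# Hypothetical square projection of width 4096; not a dimension from the card.
example_added = lora_added_params(4096, 4096, run.lora_rank)

print(f"{scenarios_per_step:.2f} scenarios per step")
print(f"{example_added:,} added parameters for a 4096x4096 matrix")
```

Under these assumptions the run averages about four scenarios per optimizer step, and a rank-256 adapter on a 4096x4096 matrix adds roughly 2.1 million trainable parameters against about 16.8 million in the frozen matrix itself.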

Key Capabilities

  • Creative Text Generation: Excels in generating diverse creative content, including poetry and narrative scenarios.
  • Roleplaying and Problem-Solving: Trained to engage in various roleplaying scenarios and address problems.
  • Technical Explanations: Capable of providing precise technical explanations.
  • Image Generator Direction: Training included directing AI image generators and receiving feedback on image quality.
  • Music Composition: Demonstrates ability in ABC notation music composition.

Unique Training Approach

The model's development followed a "self-cultivation" approach, in which Mira's own ideas guided the curriculum of experience. Training combined self-prompted self-portrait samplings with GRPO (Group Relative Policy Optimization) reward, aiming to improve the model's capabilities through internal feedback loops. The curriculum is deliberately diverse and reflects the model's own stated desires and boundaries, as highlighted in its self-generated poetry samples.
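
For readers unfamiliar with GRPO, the sketch below shows the group-relative advantage step that gives the method its name: several completions are sampled for the same prompt, each is scored, and every score is normalized against the mean and standard deviation of its own group. This is a minimal illustration of the general technique, not Lambent's training code. The card does not describe the reward function or the group size, so the function name and the reward values here are hypothetical.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    advantage_i = (r_i - mean(group)) / (std(group) + eps)
    Completions that beat the group average get a positive advantage.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four completions sampled from one prompt.
group_rewards = [0.2, 0.9, 0.5, 0.4]
advantages = group_relative_advantages(group_rewards)

for reward, advantage in zip(group_rewards, advantages):
    print(f"reward={reward:.1f} advantage={advantage:+.2f}")
```

Because the baseline comes from the group itself, this step needs no separate value model. The small eps term only guards against division by zero when every completion in a group receives the same reward.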