Lambent/Mira-v1.21-27B-rlvr
VISION | Concurrency Cost: 2 | Model Size: 27B | Quant: FP8 | Ctx Length: 32k | License: gemma | Architecture: Transformer

Lambent/Mira-v1.21-27B-rlvr is a 27-billion-parameter language model developed by Lambent, fine-tuned with Group Relative Policy Optimization (GRPO) in a unique RL environment. The model specializes in one-shot roleplaying scenarios and shows a strong capacity for voice, humor, and cleverness. With a context length of 32,768 tokens, it is optimized for creative and nuanced conversational interactions.
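
For readers unfamiliar with GRPO, the following is a minimal sketch of the group-relative advantage step as it is commonly described: several completions are sampled for the same prompt, each is scored, and every score is normalized against the mean and standard deviation of its own group. This illustrates the general technique only. The function name, the `eps` parameter, and the example reward values are assumptions for illustration and are not taken from Lambent's training setup or reward environment.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group.

    `rewards` holds one scalar score per completion sampled for the
    same prompt. `eps` (an assumed small constant) avoids division by
    zero when every completion receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical scores for four sampled completions of one prompt.
example_rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(example_rewards)
```

Completions that score above their group's mean receive a positive advantage and are reinforced; those below the mean receive a negative one. When all completions in a group score the same, every advantage is zero and that group contributes no learning signal.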
