Name: Lambent/Arsenic-Shahrazad-12B-v4.4 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: Lambent

Model Overview

Lambent/Arsenic-Shahrazad-12B-v4.4 is a 12 billion parameter language model with a 32768 token context length, developed by Lambent. This iteration, a continuation of the 4.3 lineage, focuses on refining writing craft through a sophisticated training methodology. It utilizes Gemma 4 31b as a judge during the training process, ensuring a high standard for generated text.

Training Methodology

The model underwent over 600 steps of RLVR GRPO (Reinforcement Learning from Very-Good Responses with Policy Optimization) specifically targeting 'spicy roleplaying first-turns'. This was followed by DPO (Direct Preference Optimization), which included self-rewrites of problematic trajectories identified during RLVR. The training involved running 5 seeds at a low batch size to explore diverse outputs, which were then merged using the Karcher Mean method. This approach aimed to address issues like 'godmoded POV' and enhance overall narrative quality.

Key Characteristics

Refined Writing Craft: Explicitly trained with a focus on improving the quality and nuance of generated text, particularly in creative and roleplaying contexts.
Advanced Fine-tuning: Leverages a combination of RLVR GRPO and DPO, with a powerful judge model (Gemma 4 31b) guiding the optimization process.
Merged Architecture: Created from a Karcher Mean merge of five distinct training seeds, contributing to a robust and well-rounded model.
Roleplay Optimization: Designed to excel in generating engaging and well-structured first-turns for roleplaying scenarios.

Use Cases

Creative Writing: Ideal for generating narrative content, character dialogue, and descriptive passages with a focus on writing quality.
Roleplaying Applications: Particularly suited for applications requiring nuanced and engaging responses in interactive storytelling or roleplay environments.
Content Generation: Can be used for generating diverse textual content where stylistic quality and narrative depth are important.

Overview

Model Overview

Training Methodology

Key Characteristics

Use Cases

Full Model Card (README)