Name: serving-d-cause/writing-roleplay-20k-context-nemo-12b-v1.0 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: serving-d-cause

Model Overview

serving-d-cause/writing-roleplay-20k-context-nemo-12b-v1.0 is a 12 billion parameter model based on the Mistral-Nemo-Base-2407 architecture, specialized in creative writing and multi-turn roleplay. It was fine-tuned using a meticulously curated dataset, including synthetic roleplay conversations and storywriting data, with a focus on maintaining coherence over long contexts.

Key Capabilities

Extended Context Roleplay: Trained on self-generated multi-turn roleplay conversations, with the longest examples reaching approximately 20,000 tokens, ensuring consistent narrative flow.
Synthetic Data Generation: Utilizes advanced LLMs like Command-R-Plus and byroneverson/Mistral-Small-Instruct-2409-abliterated to create high-quality synthetic roleplay and storywriting data.
Data Filtering: Employs large models to filter out low-quality, repetitive, and inappropriate content from its training datasets, enhancing output quality.
Storywriting: Incorporates storywriting data derived from sources like aetherroom.club, processed to improve and extend narrative length.

Training Details

The model was trained using QLoRA with a lora_r of 128 and lora_alpha of 256, targeting linear modules including embed_tokens and lm_head. It uses a sequence_len of 20000 and flash_attention for efficiency. The training dataset includes openerotica/mixed-rp, anthracite-org/stheno-filtered-v1.1, anthracite-org/kalo_misc_part2, anthracite-org/kalo_opus_misc_240827, anthracite-org/kalo-opus-instruct-22k-no-refusal, Chaser-cz/sonnet35-charcard-roleplay-sharegpt, and a subset of jondurbin/airoboros-3.2.

Overview

Model Overview

Key Capabilities

Training Details

Full Model Card (README)