aimeri/spoomplesmaxx-base-qwen3-14b
aimeri/spoomplesmaxx-base-qwen3-14b is a 14-billion-parameter continued pre-training (CPT) of Qwen3-14B-Base, developed by aimeri. It is trained on a curated mix of fiction, character knowledge, prose, and domain-specific corpora, with a context length of 32,768 tokens. The model is designed as a foundation for creative writing, character roleplay, and uncensored conversational AI, emphasizing narrative style and domain knowledge over factual accuracy. It is the initial CPT stage of a pipeline that will add SFT and DPO for instruction following and alignment.
Model Overview
This model, aimeri/spoomplesmaxx-base-qwen3-14b, is a 14 billion parameter continued pre-training (CPT) of the Qwen3-14B-Base. It represents the foundational stage of the SpoomplesMaxx training pipeline (CPT → SFT → DPO), focusing on teaching general language patterns, domain knowledge, and writing styles without instruction tuning. The CPT stage grounds the model in character knowledge, narrative prose, multilingual content, and uncensored language.
Key Training Details
The model was trained on a curated dataset (aimeri/spoomplesmaxx-cpt-small-Qwen3-14B-Base) across three curriculum phases, emphasizing:
- Core Knowledge: Characters, lore, world-building (e.g., custom character cards, AO3 works, NSFW prose).
- Domain Prose: Writing quality and narrative style (e.g., Gutenberg prose, LongPage, light novels).
- Language Diversity: Robustness and multilingual content (e.g., toxic conversations, Fandom wiki lore).
Training ran for 2 epochs (approximately 782 steps) on 4 NVIDIA H100 GPUs over about 14 hours, reaching a final train loss of 1.735 and a perplexity of 5.668.
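The two reported metrics are consistent with each other: perplexity is simply the exponential of the cross-entropy train loss, as a quick check shows.

```python
import math

# Perplexity is exp(cross-entropy loss), so the reported train loss
# of 1.735 should imply a perplexity close to the reported 5.668.
train_loss = 1.735
perplexity = math.exp(train_loss)
print(round(perplexity, 2))  # → 5.67
```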
Intended Use Cases (after full pipeline)
- Creative writing and fiction generation.
- Character roleplay with consistent personas.
- Uncensored conversational AI.
- Multilingual content generation (primarily English, Portuguese, some Italian/Spanish).
Limitations
As a base model, it lacks instruction-following capabilities and will primarily perform text continuation. Its domain knowledge is biased toward fiction and creative writing, and it was trained on uncensored/NSFW data, which may result in explicit outputs. It is not recommended for direct use as a chat assistant or for factual Q&A.
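Since this is a base (non-instruct) checkpoint, the natural way to exercise it is plain text continuation with no chat template. A minimal sketch using the Hugging Face `transformers` library follows; the sampling parameters are illustrative assumptions, not tuned values from the training run.

```python
# Minimal text-continuation sketch for a base model via Hugging Face
# transformers. Sampling settings below are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "aimeri/spoomplesmaxx-base-qwen3-14b"

def continue_text(prompt: str, max_new_tokens: int = 200) -> str:
    """Continue a raw prose prompt; no chat template is applied."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        device_map="auto",   # requires the accelerate package
        torch_dtype="auto",
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,      # base models generally need sampling
        temperature=0.8,     # assumed value; tune for your use case
        top_p=0.95,
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(continue_text("The old lighthouse keeper had one rule:"))
```

Note that the prompt is raw prose rather than a user/assistant exchange; until the SFT/DPO stages are applied, chat-style prompts will simply be continued as text.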