Overview
wave-on-discord/silly-v0.2 is a 12 billion parameter language model, built upon the Mistral-Nemo-Base-2407 architecture. Its primary objective is to replicate the distinctive writing style commonly found in character.ai models, making it a specialized tool for interactive and narrative-driven applications.
Key Capabilities
- Roleplay Emulation: Specifically fine-tuned to generate responses that mimic character.ai's conversational and narrative style.
- Contextual Coherence: Demonstrates strong ability to maintain context and coherence in roleplaying scenarios.
- Conversation Initiation: Capable of effectively starting and developing conversations.
- Writing Quality: Users report high-quality and on-task output, particularly for a 12B parameter model.
- ChatML Format: Utilizes the ChatML chat format for structured interactions.
Training Methodology
The model was developed through a two-stage training process:
- Supervised Fine-Tuning (SFT): Two epochs of SFT were performed using roleplay-specific datasets.
- Reinforcement Learning (PPO): Approximately one hour of Proximal Policy Optimization (PPO) was conducted on 8xH100 GPUs, leveraging the POLAR-7B RFT reward model. This showcases POLAR's utility for "out of distribution" tasks like roleplaying.
Considerations for Use
While effective, users should be aware that for longer messages, decreasing the temperature setting may be necessary to mitigate "wonky" or less stable outputs. The model may exhibit some "small model stupid" tendencies, occasionally inventing or forgetting minor details, suggesting a potentially smaller effective context window despite its technical capacity.