EVA-Qwen2.5-7B-v0.1: Roleplay and Storywriting Specialist
EVA-Qwen2.5-7B-v0.1 is a 7.6-billion-parameter, full-parameter finetune of the Qwen2.5-7B architecture, developed by Kearm and Auri. Version 0.1 refines the dataset mixture and adjusts the learning rate relative to the previous release, improving stability and the handling of short inputs and min_p sampling.
Key Capabilities & Optimizations
- Roleplay and Storywriting: Finetuned specifically for versatile, creative generation in roleplay and storywriting scenarios.
- Data Mixture: Utilizes an expanded data mixture based on Celeste 70B 0.1, incorporating datasets like Kalomaze's Opus_Instruct_25k, Gryphe's ChatGPT-4o-WritingPrompts and Sonnet3.5-Charcards-Roleplay, Auri's shortstories_synthlabels, and Epiculous's Synthstruct and SynthRP datasets.
- Context Length: Supports a context window of 131,072 tokens (128K), enabling longer, more coherent narratives.
- Prompt Format: Employs the ChatML prompt format for interaction.
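ChatML wraps each turn in `<|im_start|>`/`<|im_end|>` markers. The sketch below builds such a prompt by plain string construction; in practice the model's tokenizer chat template would usually do this for you, and the roles and message contents shown are illustrative, not from the model card.

```python
def to_chatml(messages):
    """Render a list of {role, content} dicts as a ChatML prompt string."""
    parts = []
    for msg in messages:
        parts.append(f"<|im_start|>{msg['role']}\n{msg['content']}<|im_end|>")
    # Leave an open assistant turn for the model to complete.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

# Example conversation (hypothetical contents).
messages = [
    {"role": "system", "content": "You are a creative storytelling partner."},
    {"role": "user", "content": "Continue the scene at the harbor."},
]
prompt = to_chatml(messages)
print(prompt)
```

The trailing open `assistant` turn is what cues the model to generate its reply.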
Recommended Usage
For optimal performance, the developers recommend specific sampling values:
- Temperature: 0.87
- Top-P: 0.81
- Repetition Penalty: 1.03
The model generally performs better at lower temperatures (0.9 or below), and min_p sampling is reported to work well with this version.
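As a minimal sketch, the recommended values can be collected into a generation-config dict. The key names below follow common conventions (e.g. Hugging Face transformers' `GenerationConfig`); they are an assumption, so adapt them to whatever inference stack you actually use.

```python
# Recommended sampler settings from the model card; key names assume a
# transformers-style API and may differ in other backends.
SAMPLER_SETTINGS = {
    "temperature": 0.87,        # card recommends staying at or below 0.9
    "top_p": 0.81,
    "repetition_penalty": 1.03,
    "do_sample": True,          # enable sampling rather than greedy decoding
}
```

These settings would then be passed to the backend's generate call, e.g. `model.generate(**inputs, **SAMPLER_SETTINGS)` in a transformers-style workflow.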
Training Details
The model was trained for 2 days on 4x RTX 3090 Ti GPUs. The training run was interrupted, leaving the model somewhat undertrained; a subsequent retrain is planned to further enhance its capabilities.