Model Overview: Sao10K/MN-12B-Lyra-v4a1-Old
This is an outdated 12-billion-parameter iteration of the Lyra series by Sao10K, superseded by Lyra-v4. It builds on earlier Lyra versions (v3, v2a2, v2a1), each of which used a distinct training methodology.
Key Development Steps:
The Lyra series evolved through several stages:
- Lyra-v1: Merged custom roleplay and instruct training on various formats.
- Lyra-v2a1: Incorporated additional Supervised Fine-Tuning (SFT) on a subset of previous data.
- Lyra-v2a2: Applied a low-rank SFT step and tokenizer adjustments.
- Lyra-v3: Underwent a Reinforcement Learning (RL) step on multi-turn datasets, using Lyra-v2a2 outputs as the rejected responses.
- Lyra-v4: Involved a backmerge to v2a1, LoRA extraction, and another low-rank SFT step for improved coherency.
Usage and Recommendations:
This model supports ChatML and its variants for prompting. The following sampling parameters are recommended:
- Temperature: 0.6 - 1
- min_p: 0.1 - 0.2 (crucial for NeMo-based models)
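To illustrate what the two parameters above do, here is a minimal sketch of temperature scaling followed by min_p filtering, written in plain Python. It assumes the common definition of min_p (drop every token whose probability is below min_p times the probability of the most likely token). The function name `min_p_filter` is hypothetical, and the order in which a real inference backend applies temperature and min_p may differ.

```python
import math

def min_p_filter(logits, temperature=0.8, min_p=0.1):
    """Return renormalized token probabilities after temperature and min_p.

    Illustrative only: assumes temperature is applied before min_p,
    which is not guaranteed for every backend.
    """
    # Temperature scaling: values below 1 sharpen the distribution.
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # min_p: keep tokens with probability >= min_p * (top probability).
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]

    # Renormalize the surviving tokens so they sum to 1.
    kept_total = sum(kept)
    return [p / kept_total for p in kept]


if __name__ == "__main__":
    # The very unlikely third token is removed; the other two are renormalized.
    print(min_p_filter([2.0, 1.0, -5.0], temperature=1.0, min_p=0.1))
```

Because the cutoff scales with the top token's probability, the filter is strict when the model is confident and permissive when it is not.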
Recommended stopping strings include <|im_end|>, </s>, and [/INST]. The model may produce run-on generations, a known characteristic carried over from earlier versions; proper prompting and stopping strings keep this in check. Issues with special tokens in v3's configuration may affect quantization tools, but the model runs fine unquantized.
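The ChatML prompting and stopping strings described above can be sketched as follows. This is a minimal illustration under stated assumptions: it uses the standard ChatML layout (`<|im_start|>role`, a newline, the content, then `<|im_end|>`), and the helper names `build_chatml_prompt` and `truncate_at_stop` are hypothetical rather than part of any library. Most inference frontends accept stop strings directly, in which case manual truncation is unnecessary.

```python
# Stop strings recommended for this model.
STOP_STRINGS = ["<|im_end|>", "</s>", "[/INST]"]

def build_chatml_prompt(messages):
    """Format a list of {"role", "content"} dicts as a ChatML prompt.

    Ends with an open assistant turn so the model writes the reply.
    """
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

def truncate_at_stop(text, stop_strings=STOP_STRINGS):
    """Cut generated text at the earliest stop string, if any appears."""
    cut = len(text)
    for stop in stop_strings:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


if __name__ == "__main__":
    prompt = build_chatml_prompt([
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hi"},
    ])
    print(prompt)
    # A run-on generation is trimmed at the first stop string.
    print(truncate_at_stop("Hello there.<|im_end|>\n<|im_start|>user\nMore"))
```

Truncating at the earliest match guards against run-on output that continues into a new turn after the end-of-turn token.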