Nemo-Instruct-2407-MPOA-v4-12B Overview
This model, developed by grimjim, applies Magnitude-Preserving Orthogonalized Ablation (MPOA) to layers 10-34, specifically the mlp.down_proj.weight and self_attn.o_proj.weight tensors. The technique projects ablation directions out of these weights while preserving their overall magnitude, shaping the model's generative behavior without distorting weight scales.
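The model card does not publish the exact MPOA procedure, but one plausible interpretation can be sketched as follows: remove the component of a weight matrix that writes into a given direction (standard orthogonalized ablation, as in "abliteration"), then rescale the result so its Frobenius norm matches the original. The function name `mpoa_ablate`, the use of a single refusal direction, and the Frobenius-norm rescaling are all assumptions for illustration, not grimjim's actual implementation.

```python
import numpy as np

def mpoa_ablate(W, r):
    """Hypothetical sketch of magnitude-preserving orthogonalized ablation.

    W: weight matrix of shape (d_out, d_in), e.g. a down_proj or o_proj tensor.
    r: direction in the output space (d_out,) to ablate, e.g. a refusal direction.
    """
    r = r / np.linalg.norm(r)
    # Orthogonalize: remove the component of W that writes along r.
    W_ablated = W - np.outer(r, r) @ W
    # Preserve magnitude: rescale so the Frobenius norm matches the original.
    # A scalar rescale keeps r fully projected out (r @ W_ablated stays zero).
    return W_ablated * (np.linalg.norm(W) / np.linalg.norm(W_ablated))
```

Rescaling by a single scalar (rather than per-row norms) is chosen here because it preserves the orthogonalization exactly; a per-row rescale would reintroduce a component along the ablated direction.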
Key Characteristics
- MPOA Integration: Applies Magnitude-Preserving Orthogonalized Ablation to mid-network layers, tuning behavior while keeping weight magnitudes intact.
- Balanced Safety Refusals: The model sits near an "edge of chaos" with respect to safety refusals, a deliberate choice favoring less restrictive output than highly compliant models.
- Multilingual Training: Tuned using harmless and harmful prompt sets in Chinese, English, and French, giving coverage across these languages.
- Coherent English Generation: Despite the multilingual training, the model maintains coherent English text generation.
Good for
- Varied Text Completion: The relaxed approach to safety refusals makes it suitable for diverse, less constrained text generation tasks.
- Multilingual Applications: Effective for applications requiring text generation in English, Chinese, and French due to its training data composition.