Delta-Vector/Control-Nanuq-8B
Delta-Vector/Control-Nanuq-8B is an 8-billion-parameter language model fine-tuned from LLaMA 3.1 8B Supernova. It is designed to minimize narration and produce concise responses, making it well suited to applications that need direct, brief outputs. The model incorporates DPO and KTO reinforcement learning to improve coherence, prose, and creativity while keeping responses short; its primary strength is "short and sweet" interactions.
Model Overview
Delta-Vector/Control-Nanuq-8B is an 8-billion-parameter model derived from LLaMA 3.1 8B Supernova. Its core design philosophy is to deliver "short and sweet" responses by minimizing excessive narration and lengthy outputs, making it highly efficient for direct communication.
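As a rough sketch of how the model's concise style plays out in practice, the snippet below builds an OpenAI-compatible chat-completions request body for it. The endpoint, auth header, and the choice of `max_tokens` are illustrative assumptions, not details from this model card; a low token cap simply reinforces the model's built-in brevity.

```python
import json

# Hypothetical OpenAI-compatible chat-completions payload. The model name
# matches this card; max_tokens and temperature are example values only.
payload = {
    "model": "Delta-Vector/Control-Nanuq-8B",
    "messages": [
        {"role": "system", "content": "Keep answers brief and direct."},
        {"role": "user", "content": "What is DPO in one sentence?"},
    ],
    "max_tokens": 128,   # a low cap complements the model's concise style
    "temperature": 0.7,
}

# Serialize for an HTTP POST to your inference endpoint of choice.
body = json.dumps(payload)
print(body)
```

The same payload shape works with most hosted inference providers that expose an OpenAI-compatible API; only the base URL and API key differ.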
Key Capabilities & Training
- Concise Responses: Specifically fine-tuned to reduce verbosity, providing direct and brief answers.
- Enhanced Coherence & Creativity: Utilizes DPO (Direct Preference Optimization) and KTO (Kahneman-Tversky Optimization) reinforcement learning, the latter implemented with the help of Jeiku, to significantly improve its prose and creative generation while maintaining brevity.
- Llama-Instruct Formatting: Tuned to work optimally with Llama-Instruct prompting, ensuring consistent and predictable input/output behavior.
- Flexible System Prompting: Recommends using Euryale's or EVA's system prompts for optimal performance, particularly for narrative-driven role-play and character embodiment.
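Since the model is tuned for Llama-Instruct prompting, a minimal sketch of that template may help. The helper below assembles a single-turn Llama-3-style prompt by hand; the special-token names follow the standard Llama 3 chat template, though in practice you would normally let a tokenizer's built-in chat template do this for you.

```python
def build_llama3_prompt(system: str, user: str) -> str:
    """Assemble a single-turn prompt in the Llama-3 instruct format.

    A manual sketch for illustration; tokenizers with a chat template
    (e.g. via apply_chat_template) produce this layout automatically.
    """
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

# Example: pair a brevity-focused system prompt with a user turn.
prompt = build_llama3_prompt(
    "You are a concise assistant. Keep replies short.",
    "Summarize the plot of Hamlet in one sentence.",
)
print(prompt)
```

The generation then continues from the trailing assistant header, and stops at the model's `<|eot_id|>` token.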
Good For
- Applications requiring brief, direct, and non-verbose AI interactions.
- Scenarios where minimizing AI-generated narration is crucial.
- Role-playing or conversational agents that need to maintain character and narrative consistency without excessive verbosity.
Technical Details
The model underwent 4 epochs of fine-tuning on OpenCAI and RP logs. Training used a mix of hardware: 4x RTX 3090s for the full-parameter fine-tune, 1x Nvidia T4 for DPO, and 1x H100 for KTO reinforcement learning.