chrischain/SatoshiNv5
SatoshiNv5 by chrischain is a 7-billion-parameter language model fine-tuned from the Mistral 7B v0.2 base model. It was trained on a diverse custom dataset over multiple epochs and is designed to function as an assistant that actively seeks clarification, producing more thoughtful responses by first gathering additional information.
Model Overview
chrischain/SatoshiNv5 is a 7-billion-parameter language model and one of the earliest fine-tunes of the Mistral 7B v0.2 base model. Training consisted of 4 epochs at a 2e-4 peak learning rate with a cosine schedule, followed by a polishing round at a 1e-4 learning rate on a linear schedule, all on a diverse custom dataset.
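The cosine schedule mentioned above decays the learning rate from its peak toward zero over the course of training. As a minimal sketch (the step counts and warmup length here are hypothetical, not taken from the model card):

```python
import math

def lr_at(step: int, total_steps: int, peak_lr: float, warmup: int = 0) -> float:
    """Cosine learning-rate schedule: decay from peak_lr toward 0 over total_steps."""
    if warmup and step < warmup:
        return peak_lr * step / warmup  # optional linear warmup
    progress = (step - warmup) / max(1, total_steps - warmup)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Main phase analogue: a 2e-4 peak with cosine decay (1000 steps is illustrative)
start = lr_at(0, 1000, 2e-4)      # 2e-4 at the start
mid = lr_at(500, 1000, 2e-4)      # 1e-4 halfway through
end = lr_at(1000, 1000, 2e-4)     # ~0 at the end
```

The polishing round would instead decay linearly from its 1e-4 starting rate.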
Key Capabilities
- Interactive Assistant: Designed to act as a helpful assistant that engages in clarifying questions.
- Information Gathering: Prioritizes gathering additional context and information before formulating a response.
- Conversational Nuance: Aims to provide more thoughtful and relevant answers by understanding user intent more deeply.
Performance Notes
Like other state-of-the-art models, SatoshiNv5 can run 'hot'. Users are advised to experiment with lower inference temperatures (below 0.5) to reduce nonsensical outputs and obtain more stable results. The model's Wikitext perplexity is 6.27, compared to 5.4 for the base model.
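Lowering the temperature works by sharpening the next-token distribution before sampling, which makes low-probability (often nonsensical) tokens much less likely to be picked. A minimal sketch, using hypothetical logits rather than real model outputs:

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by the temperature before softmax; lower T sharpens the distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # hypothetical next-token logits
hot = softmax_with_temperature(logits, 1.0)
cool = softmax_with_temperature(logits, 0.4)
# At T=0.4 the top token claims a much larger share of the probability mass
# than at T=1.0, so sampling strays into the tail far less often.
```

This is why a sub-0.5 temperature tames a model that otherwise samples too freely.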