chrischain/SatoshiNv5

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 8K · Published: Apr 1, 2024 · License: cc-by-2.0 · Architecture: Transformer · Open Weights

SatoshiNv5 by chrischain is a 7 billion parameter language model fine-tuned from the Mistral 7B 0.2 Base model. It was trained on a diverse custom dataset over multiple epochs and is designed to function as an assistant that actively seeks clarification, excelling at interactive conversational tasks by gathering additional information before responding.


Model Overview

chrischain/SatoshiNv5 is a 7 billion parameter language model and one of the first fine-tunes of the Mistral 7B 0.2 Base model. It was trained for 4 epochs at a 2e-4 peak learning rate with a cosine schedule, followed by a polishing round at a 1e-4 learning rate with linear decay, on a diverse custom dataset.
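The two-stage schedule described above (cosine decay from a 2e-4 peak, then a linear polishing round from 1e-4) can be sketched as plain functions. This is a minimal illustration of the schedule shapes, not the author's training code; the step counts are placeholders.

```python
import math

def cosine_lr(step, total_steps, peak_lr=2e-4, min_lr=0.0):
    """Cosine decay: starts at peak_lr, falls to min_lr at total_steps."""
    progress = step / total_steps
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))

def linear_lr(step, total_steps, peak_lr=1e-4, min_lr=0.0):
    """Linear decay for the polishing round."""
    return peak_lr - (peak_lr - min_lr) * (step / total_steps)

# Main run: cosine schedule from the 2e-4 peak (1000 steps is illustrative)
print(cosine_lr(0, 1000))     # 2e-4 at the start
print(cosine_lr(1000, 1000))  # decays to 0 at the end

# Polishing round: linear decay from 1e-4
print(linear_lr(500, 1000))   # 5e-05 halfway through
```

The cosine stage spends more steps near the peak rate than a linear ramp would, which is a common choice for the main fine-tuning pass before a gentler polishing round.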

Key Capabilities

  • Interactive Assistant: Designed to act as a helpful assistant that asks clarifying questions.
  • Information Gathering: Prioritizes gathering additional context and information before formulating a response.
  • Conversational Nuance: Aims to provide more thoughtful and relevant answers by understanding user intent more deeply.

Performance Notes

Like other state-of-the-art models, SatoshiNv5 can run 'hot', producing overly random output at default sampling settings. Users are advised to experiment with lower inference temperatures (below 0.5) to mitigate nonsensical outputs and achieve more stable results. The model's Wikitext perplexity is 6.27, versus 5.4 for the base model.
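To see why a temperature below 0.5 stabilizes a 'hot' model, note that sampling divides the logits by the temperature before the softmax, so lower values concentrate probability mass on the top tokens. A minimal sketch with made-up logit values:

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Divide logits by temperature, then softmax; lower T sharpens the distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                          # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]                     # illustrative token logits
p_hot = softmax_with_temperature(logits, 1.0)
p_cool = softmax_with_temperature(logits, 0.4)
print(p_hot[0] < p_cool[0])                  # True: lower T favors the top token
```

At temperature 1.0 the top token here gets roughly 63% of the mass; at 0.4 it gets over 90%, which is the mechanism behind the "more stable results" advice above.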

Popular Sampler Settings

The top 3 parameter combinations used by Featherless users for this model cover the following sampler settings:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
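The truncation samplers in that list (top_k, top_p, min_p) each prune the candidate token set before sampling. A minimal sketch of how the three cuts interact, assuming a simplified pipeline (real inference servers apply these to logits with their own ordering and defaults):

```python
def filter_candidates(probs, top_k=40, top_p=0.9, min_p=0.05):
    """Return token indices surviving top-k, nucleus (top-p) and min-p cuts.

    probs: list of token probabilities summing to 1.
    top_k:  keep at most k highest-probability tokens.
    top_p:  stop once cumulative probability reaches the nucleus threshold.
    min_p:  drop tokens below min_p times the top token's probability.
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    p_max = probs[order[0]]
    for i in order[:top_k]:
        if cumulative >= top_p:          # nucleus cutoff reached
            break
        if probs[i] < min_p * p_max:     # min-p: floor relative to the best token
            break
        kept.append(i)
        cumulative += probs[i]
    return kept

probs = [0.5, 0.25, 0.15, 0.07, 0.03]    # illustrative distribution
print(filter_candidates(probs, top_k=4, top_p=0.9, min_p=0.1))  # → [0, 1, 2]
```

Note that min_p scales with the top token's probability, so it adapts to how peaked the distribution is, whereas top_k and top_p use fixed cutoffs; this is why min_p is often paired with higher temperatures.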