awnr/Mistral-7B-v0.1-half-naive-A

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 8K · Published: Mar 24, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

awnr/Mistral-7B-v0.1-half-naive-A is an experimental 7-billion-parameter language model, a modification of Mistral-7B-v0.1. Developed by Dr. Alex W. Neal Riasanovsky, it replaces some of the original weight matrices, with the primary goal of investigating how these changes affect performance metrics relative to the unmodified Mistral-7B-v0.1. It is intended for research into neural-network weight-matrix experimentation and performance analysis.


Model Overview

awnr/Mistral-7B-v0.1-half-naive-A is an experimental 7-billion-parameter language model derived from mistralai/Mistral-7B-v0.1. Developed by Dr. Alex W. Neal Riasanovsky, the model incorporates targeted modifications in which some of the original weight matrices have been replaced. The objective is to study how these altered weight matrices affect the model's performance in comparison to the original Mistral-7B-v0.1.

Key Characteristics

  • Experimental Modification: This model is a direct clone of Mistral-7B-v0.1 with targeted weight matrix replacements.
  • Research Focus: The project aims to observe how these internal changes influence benchmark scores and overall model behavior.
  • Base Model: Built upon the robust Mistral-7B-v0.1 foundation, inheriting its 8192 token context length.
  • License: Distributed under the Apache-2.0 license.
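
The card does not specify which matrices were replaced or how. As a hypothetical sketch of the kind of analysis such an experiment invites, the helper below compares two checkpoints' weight dictionaries and reports which matrices differ (the function names and the use of plain NumPy arrays are illustrative assumptions, not part of the model's actual tooling):

```python
import numpy as np

def relative_weight_diff(w_orig: np.ndarray, w_mod: np.ndarray) -> float:
    """Relative Frobenius-norm difference between an original and a
    (possibly replaced) weight matrix."""
    return float(np.linalg.norm(w_mod - w_orig) / np.linalg.norm(w_orig))

def changed_matrices(orig: dict, mod: dict, tol: float = 1e-6) -> list:
    """Names of weight matrices whose values differ between two checkpoints
    that share the same parameter names and shapes."""
    return [name for name in sorted(orig)
            if relative_weight_diff(orig[name], mod[name]) > tol]
```

Run against the base Mistral-7B-v0.1 and this model's state dicts, a scan like this would show exactly which layers were swapped and by how much.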

Intended Use and Limitations

This model is currently a research-in-progress artifact. Its primary utility lies in facilitating computational experiments to test hypotheses regarding neural network weight adjustments. Users should be aware that its biases, risks, and limitations are largely unknown and are part of the ongoing research. It is not intended for production use or applications requiring stable, predictable performance, but rather for academic or experimental exploration of model internals.

Popular Sampler Settings

Featherless users most commonly tune the following sampler parameters for this model:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
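
These parameters map directly onto the request body of an OpenAI-compatible completion call. A minimal sketch with illustrative placeholder values (the actual top user combinations are not reproduced here, and the values below are assumptions, not recommendations):

```python
# Hypothetical sampler configuration for an OpenAI-compatible request.
# Every numeric value below is an illustrative placeholder.
sampler_config = {
    "model": "awnr/Mistral-7B-v0.1-half-naive-A",
    "temperature": 0.7,        # randomness of token sampling
    "top_p": 0.9,              # nucleus sampling cutoff
    "top_k": 40,               # restrict sampling to the k most likely tokens
    "frequency_penalty": 0.0,  # penalize tokens by how often they appeared
    "presence_penalty": 0.0,   # penalize tokens that appeared at all
    "repetition_penalty": 1.1, # multiplicative penalty on repeated tokens
    "min_p": 0.05,             # drop tokens below this fraction of the top probability
}
```

Such a dictionary would typically be sent as the JSON body of a completions request, alongside the prompt.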