Nexesenex/Codellama-2-7b-Miniguanaco-Mistral is a 7 billion parameter language model built on the CodeLlama 2 architecture. It integrates a Guanaco LoRA and a Mistral AI 7B 0.1 delta, aiming to combine their respective strengths. The model is intended primarily for experimental use and amusement, and leverages the CodeLlama base's 16k training context alongside Mistral's 8k context.
Model Overview
Nexesenex/Codellama-2-7b-Miniguanaco-Mistral is an experimental 7 billion parameter language model. It is constructed by merging several components: a CodeLlama 2 7B base, a Guanaco LoRA (originally from Tim Dettmers, merged by Varunk29), and a Mistral AI 7B 0.1 delta (extracted by Undi95 and merged by Nexesenex).
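The card does not publish the exact merge recipe, but a LoRA component can be folded into a base model along the lines of the minimal sketch below, using `transformers` and `peft`. The repository paths are illustrative placeholders, not the actual sources used for this merge, and the separate Mistral 7B 0.1 delta step (weight arithmetic) is not shown.

```python
# Minimal sketch of folding a LoRA adapter into a base model with peft.
# Paths are placeholders; the actual Guanaco LoRA and base weights used
# by Nexesenex may differ.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "codellama/CodeLlama-7b-hf"    # assumed CodeLlama 2 7B base repo
LORA = "path/to/guanaco-7b-lora"      # hypothetical Guanaco LoRA adapter path

base = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.float16)
lora = PeftModel.from_pretrained(base, LORA)

# merge_and_unload() bakes the adapter weights into the base model and
# returns a plain transformers model that can be saved and re-shared.
merged = lora.merge_and_unload()
merged.save_pretrained("codellama-7b-guanaco-merged")

tokenizer = AutoTokenizer.from_pretrained(BASE)
tokenizer.save_pretrained("codellama-7b-guanaco-merged")
```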
Key Characteristics
- Base Architecture: CodeLlama 2 7B, known for its code-related capabilities.
- Merged Components: Integrates a Guanaco LoRA and a Mistral AI 7B 0.1 delta, combined in an attempt to enhance performance or introduce new characteristics.
- Context Length: The base CodeLlama model was trained with a 16k context and may extrapolate to roughly 96k tokens thanks to its base RoPE settings. The Mistral injection component was trained with an 8k context, though its Sliding Window Attention is likely inoperable in this merged configuration (see the loading sketch after this list).
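As a rough illustration of how these context figures map onto the model's configuration, the sketch below loads the model and inspects the fields that govern usable context for Llama-family models. The repository name is assumed from the model card, and the printed values depend on the published config.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

REPO = "Nexesenex/Codellama-2-7b-Miniguanaco-Mistral"  # assumed HF repo name

tokenizer = AutoTokenizer.from_pretrained(REPO)
model = AutoModelForCausalLM.from_pretrained(
    REPO,
    torch_dtype=torch.float16,
    device_map="auto",
)

# CodeLlama-family configs use a large rope_theta (typically 1e6), which is
# what allows extrapolation well beyond the 16k training context.
print("max_position_embeddings:", model.config.max_position_embeddings)
print("rope_theta:", getattr(model.config, "rope_theta", None))
```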
Intended Use
This model is explicitly described as being "for test and amusement only," underlining its experimental nature. It is not presented as a production-ready solution but rather as a fusion for exploration and evaluation. It supports Alpaca-style prompting (see the prompt sketch below).
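Since the card only states that Alpaca-style prompting is supported, the exact template is an assumption; the sketch below uses the standard Alpaca instruction format with a plain `generate` call, reusing the assumed repository name from above.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

REPO = "Nexesenex/Codellama-2-7b-Miniguanaco-Mistral"  # assumed HF repo name
tokenizer = AutoTokenizer.from_pretrained(REPO)
model = AutoModelForCausalLM.from_pretrained(
    REPO, torch_dtype=torch.float16, device_map="auto"
)

# Standard Alpaca instruction template (an assumption; the card only says
# "Alpaca-style prompting" without giving the exact format).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

prompt = ALPACA_TEMPLATE.format(
    instruction="Write a Python function that reverses a string."
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```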