Overview
neph1/bellman-7b-mistral-instruct is a 7-billion-parameter instruction-tuned model built on the Mistral architecture. Developed by neph1, it went through several rounds of fine-tuning on a custom dataset focused on Swedish-language content. The goal was to give the model a deeper understanding of Swedish nuances while building on the general world knowledge Mistral already acquired during its original training.
Key Capabilities
- Swedish Language Proficiency: Fine-tuned specifically to generate coherent, verbose responses in Swedish.
- Instruction Following: Designed to follow instruct-type Q&A prompts effectively.
- Mistral Base: Inherits the general capabilities of the underlying Mistral model.
Training Details
The model was first fine-tuned for 5 epochs, then for another 5 epochs on an expanded dataset of 3,600 instruct-type Q&A rows. The developer noted initial overfitting and plans to expand the dataset further and lower the learning rate to mitigate it. The model is intended to be used with the standard Llama 2 and Mistral prompt format, i.e., with the instruction wrapped in [INST] and [/INST] tags.
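As a rough illustration of the [INST] [/INST] format, the sketch below wraps a single-turn instruction in those tags. The helper name `build_prompt` is hypothetical, and the exact whitespace around the tags is an assumption based on common Mistral-instruct usage; the beginning-of-sequence token is deliberately left out, on the assumption that the tokenizer adds it.

```python
def build_prompt(instruction: str) -> str:
    """Wrap a single-turn instruction in [INST] ... [/INST] tags.

    Hypothetical helper: the spacing around the tags is an assumption,
    and no BOS token is added here (the tokenizer is assumed to do that).
    """
    # Trim stray whitespace so the tags are separated by exactly one space.
    return f"[INST] {instruction.strip()} [/INST]"


if __name__ == "__main__":
    # Example Swedish instruction: "What is the capital of Sweden?"
    print(build_prompt("Vad är Sveriges huvudstad?"))
```

The resulting string would be passed to the model as the prompt, and the model's answer is expected to follow the closing [/INST] tag.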
Good For
- Applications requiring high-quality, verbose text generation in Swedish.
- Instruction-based tasks where the primary language is Swedish.
- Exploring fine-tuned Mistral models with a specific linguistic focus.