Overview
HenryJJ/Instruct_Mistral-7B-v0.1_Dolly15K is a 7-billion-parameter instruction-tuned language model. It is based on Mistral-7B-v0.1, a decoder-only transformer architecturally similar to the Llama family, with grouped-query attention and sliding-window attention. The model was fine-tuned by HenryJJ on the Dolly15K dataset for 2.0 epochs, with 90% of the dataset used for training and 10% for validation. It supports English, with a context window of 1024 tokens.
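The 90/10 train/validation split described above can be sketched as follows. This is a minimal illustration of the split logic only; the record format, shuffle, and seed are assumptions, not details taken from the model card:

```python
import random

def split_dataset(records, train_frac=0.9, seed=42):
    """Shuffle records and split into train/validation subsets."""
    rng = random.Random(seed)  # fixed seed for reproducibility (assumed, not from the card)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

# Dolly15K contains roughly 15,000 instruction-response records
records = [{"instruction": f"q{i}", "response": f"a{i}"} for i in range(15011)]
train, val = split_dataset(records)
print(len(train), len(val))  # 13509 1502
```

A fixed split like this keeps the validation set stable across fine-tuning runs, which makes epoch-to-epoch loss comparisons meaningful.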
Key Capabilities
- Instruction Following: The model is designed to follow instructions, as indicated by its fine-tuning on the Dolly15K dataset, which is known for its instruction-response pairs.
- General Purpose Text Generation: Capable of generating text based on given prompts, with or without context, as demonstrated by the provided prompt templates.
- Open-source Training: The training script used for this model is fully open-sourced, allowing for transparency and reproducibility.
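The prompt templates mentioned above (with and without context) can be sketched as a small formatting helper. The exact layout below is an illustrative assumption modeled on common Dolly/Alpaca-style templates; consult the model card for the precise format the model was trained on:

```python
def build_prompt(instruction, context=None):
    """Format an instruction (optionally with context) into a single prompt string.
    The template text is a common Dolly/Alpaca-style layout, shown here as an
    assumption -- not confirmed to be this model's exact training format."""
    if context:
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{context}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )

print(build_prompt("Summarize the plot of Hamlet."))
```

Matching the training-time template at inference time generally matters for instruction-tuned models: deviating from it tends to degrade response quality.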
Performance Highlights
Recent evaluations show the model achieving an overall accuracy (acc) of 0.624 and a length-normalized accuracy (acc_norm) of 0.629 averaged across benchmarks. Notable scores include:
- HellaSwag: 0.826 acc_norm
- High School Government and Politics: 0.844 acc_norm
- Marketing: 0.858 acc_norm
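The acc_norm figures above are length-normalized accuracies, as reported by harness-style multiple-choice evaluation: instead of picking the answer choice with the highest total log-probability, the score is divided by the choice's length so longer answers are not penalized. A minimal sketch of the two selection rules, with hypothetical log-probabilities:

```python
def pick_answer(choice_logprobs, choice_byte_lengths):
    """Select a multiple-choice answer two ways:
    acc      -> index of the choice with the highest total log-probability
    acc_norm -> index of the choice with the highest log-probability per byte
    (length-normalized, so longer answers are not unfairly penalized)."""
    acc_pick = max(range(len(choice_logprobs)), key=lambda i: choice_logprobs[i])
    acc_norm_pick = max(
        range(len(choice_logprobs)),
        key=lambda i: choice_logprobs[i] / choice_byte_lengths[i],
    )
    return acc_pick, acc_norm_pick

# Toy example (hypothetical numbers): a long answer scores lower in total
# log-probability but higher per byte, so the two rules disagree.
logps = [-12.0, -5.0]  # summed log-probs per answer choice
lens = [40, 10]        # answer lengths in bytes
print(pick_answer(logps, lens))  # (1, 0): acc picks choice 1, acc_norm picks choice 0
```

Both metrics count a question as correct when the picked choice matches the gold answer; reporting both, as the evaluations above do, shows how sensitive the result is to answer length.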
Good for
- Developers looking for a 7B parameter model fine-tuned for instruction-following in English.
- Applications requiring general text generation and conversational capabilities.
- Researchers interested in models fine-tuned on the Dolly15K dataset.