HenryJJ/Instruct_Mistral-7B-v0.1_Dolly15K

Text Generation | Model Size: 7B | Quant: FP8 | Context Length: 4k | Concurrency Cost: 1 | Published: Jan 2, 2024 | License: apache-2.0 | Architecture: Transformer | Open Weights

Instruct_Mistral-7B-v0.1_Dolly15K is a 7-billion-parameter instruction-tuned causal language model developed by HenryJJ. Fine-tuned from the Mistral-7B-v0.1 base model on the Dolly15K dataset, it targets general English instruction-following tasks. The model was trained for 2.0 epochs with a 1024-token context window, making it suitable for a range of conversational and generative applications.


Overview

HenryJJ/Instruct_Mistral-7B-v0.1_Dolly15K is a 7-billion-parameter instruction-tuned language model. It is based on Mistral-7B-v0.1, whose transformer design is similar to Llama 2's. HenryJJ fine-tuned the model on the Dolly15K dataset for 2.0 epochs, using 90% of the dataset for training and 10% for validation. Fine-tuning used English-language data with a 1024-token context window.
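
As a rough illustration of the data preparation described above, here is a minimal sketch using the Hugging Face `datasets` library. The dataset identifier `databricks/databricks-dolly-15k` and the fixed seed are assumptions; the model's open-sourced training script is the authoritative reference.

```python
from datasets import load_dataset

# Dolly15K: ~15k human-written instruction/response pairs.
# The dataset id below is an assumption; the card only says "Dolly15K".
dataset = load_dataset("databricks/databricks-dolly-15k", split="train")

# 90% train / 10% validation, matching the split described above.
# The seed is illustrative, not taken from the original training script.
splits = dataset.train_test_split(test_size=0.1, seed=42)
train_ds, val_ds = splits["train"], splits["test"]

print(len(train_ds), len(val_ds))  # roughly 13.5k / 1.5k examples
```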

Key Capabilities

  • Instruction Following: The model is designed to follow instructions, as indicated by its fine-tuning on the Dolly15K dataset, which is known for its instruction-response pairs.
  • General Purpose Text Generation: Capable of generating text from prompts, with or without additional context, using the prompt templates provided in the model card (see the usage sketch after this list).
  • Open-source Training: The training script used for this model is fully open-sourced, allowing for transparency and reproducibility.
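
The snippet below is a minimal generation sketch using the `transformers` library. The Alpaca/Dolly-style prompt format is an assumption based on the Dolly15K fine-tune; the authoritative templates are the ones provided in the model card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HenryJJ/Instruct_Mistral-7B-v0.1_Dolly15K"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Alpaca/Dolly-style instruction prompt. The exact template used during
# fine-tuning is in the model card; this format is an assumption.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nExplain instruction tuning in two sentences.\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
completion = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(completion, skip_special_tokens=True))
```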

Performance Highlights

Reported evaluations show the model achieving an overall accuracy of 0.624 and a normalized accuracy (acc_norm) of 0.629 averaged across benchmarks. Notable per-task scores include:

  • HellaSwag: 0.826 acc_norm
  • High School Government and Politics: 0.844 acc_norm
  • Marketing: 0.858 acc_norm
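
The `acc_norm` metric suggests an lm-evaluation-harness-style run. The sketch below shows how comparable numbers could be reproduced with the harness's Python API; the task list and settings are assumptions, not the original evaluation configuration.

```python
import lm_eval

# Reproduction sketch only: tasks and settings are assumptions. The "mmlu"
# task group includes subtasks such as high_school_government_and_politics
# and marketing, whose scores are cited above.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=HenryJJ/Instruct_Mistral-7B-v0.1_Dolly15K,dtype=float16",
    tasks=["hellaswag", "mmlu"],
)
print(results["results"])
```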

Good for

  • Developers looking for a 7B parameter model fine-tuned for instruction-following in English.
  • Applications requiring general text generation and conversational capabilities.
  • Researchers interested in models fine-tuned on the Dolly15K dataset.