wandb/mistral-7b-zephyr-sft

Text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Mar 9, 2024 · License: MIT · Architecture: Transformer · Open weights

wandb/mistral-7b-zephyr-sft is a 7.2 billion parameter GPT-like model, fine-tuned from Mistral-7B-v0.1 using the Zephyr SFT recipe. Primarily English, it is optimized for instruction-following tasks based on a mix of publicly available and synthetic datasets. This model is suitable for general-purpose conversational AI applications requiring a compact yet capable language model.


Model Overview

wandb/mistral-7b-zephyr-sft is a 7.2 billion parameter language model, fine-tuned from the mistralai/Mistral-7B-v0.1 base model. It follows the Zephyr Supervised Fine-Tuning (SFT) recipe, adapted to use the ChatML format for improved instruction following. The training run followed the alignment-handbook recipe and was logged with Weights & Biases.
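The ChatML format mentioned above wraps each conversation turn in `<|im_start|>` / `<|im_end|>` markers. A minimal sketch of that layout (the `to_chatml` helper is illustrative, not part of the model's tooling; in practice the tokenizer's chat template produces the prompt):

```python
# Illustrative sketch of the ChatML prompt layout used by Zephyr-style
# SFT models. The helper function is an assumption for demonstration;
# the real prompt is normally built by the tokenizer's chat template.

def to_chatml(messages, add_generation_prompt=True):
    """Render a list of {role, content} dicts as a ChatML string."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize SFT in one sentence."},
])
print(prompt)
```

With Hugging Face Transformers, the equivalent string is typically obtained via `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)`, which reads the template shipped with the model.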

Key Capabilities

  • Instruction Following: Enhanced through the Zephyr SFT recipe and ChatML format, making it proficient in responding to user prompts and instructions.
  • Language Support: Primarily focused on English language tasks.
  • Base Model Strength: Benefits from the strong foundation of the Mistral-7B-v0.1 architecture.

Use Cases

This model is well suited to applications that need a capable yet efficient language model, including:

  • General-purpose conversational AI.
  • Instruction-based text generation.
  • Tasks where a 7B parameter model offers a balance between performance and computational efficiency.