CorticalStack/mistral-7b-dolphin-sft

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 8k · Published: Feb 17, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold

CorticalStack/mistral-7b-dolphin-sft is a 7-billion-parameter language model fine-tuned from unsloth/mistral-7b-bnb-4bit on the cognitivecomputations/dolphin dataset. Supervised fine-tuning (SFT) on this diverse conversational dataset specializes it for instruction following, and it is optimized for efficient deployment and inference, making it suitable for applications that need responsive, accurate text generation.


Overview

CorticalStack/mistral-7b-dolphin-sft is a 7 billion parameter language model derived from unsloth/mistral-7b-bnb-4bit. It has undergone Supervised Fine-Tuning (SFT) using the cognitivecomputations/dolphin dataset, which is known for its diverse and high-quality conversational data. This fine-tuning process aims to enhance the model's ability to follow instructions and generate coherent, contextually relevant responses.

Key Characteristics

  • Base Model: unsloth/mistral-7b-bnb-4bit
  • Fine-tuning Dataset: cognitivecomputations/dolphin
  • Parameter Count: 7 billion
  • Context Length: 8192 tokens
  • Training Framework: Unsloth with Hugging Face's TRL library for efficient fine-tuning.
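As a rough illustration of what the 7B parameter count means for deployment, weight-only memory scales linearly with bytes per parameter. This is a back-of-the-envelope sketch; real usage adds activations, KV cache, and framework overhead on top:

```python
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Approximate weight-only memory footprint in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

N = 7e9  # 7 billion parameters

fp16 = weight_memory_gb(N, 2.0)    # 16-bit weights
fp8 = weight_memory_gb(N, 1.0)     # 8-bit weights (the serving quant listed above)
four_bit = weight_memory_gb(N, 0.5)  # 4-bit weights (as used for the bnb base model)

print(f"FP16: {fp16:.0f} GB, FP8: {fp8:.0f} GB, 4-bit: {four_bit:.1f} GB")
# → FP16: 14 GB, FP8: 7 GB, 4-bit: 3.5 GB
```

This is why the FP8 serving quant roughly halves weight memory versus FP16, and the 4-bit base model halves it again.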

Fine-tuning Configuration Highlights

  • LoRA Parameters: r=256, alpha=128, dropout=0.0 for efficient adaptation.
  • Training Epochs: 1 epoch with a per-device batch size of 4 and 6 gradient accumulation steps (effective batch size of 24).
  • Learning Rate: 0.0002 with a linear scheduler.
  • Max Sequence Length: 2048 tokens during training (shorter than the 8192-token inference context).
  • Quantization: Trained on the 4-bit bitsandbytes (bnb) base model for a reduced memory footprint.
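Under these hyperparameters, the setup can be sketched with the peft and transformers libraries. This is a hedged reconstruction, not the author's actual training script; exact argument names vary across trl versions, and the `target_modules` list is an assumption (the projections typically targeted in Mistral LoRA fine-tunes), not something the card states:

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA adapter settings from the card above; target_modules is an assumption
# (the usual attention/MLP projections for Mistral), not confirmed by the card.
lora_config = LoraConfig(
    r=256,
    lora_alpha=128,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

# Optimizer schedule from the card: 1 epoch, batch 4 x 6 accumulation steps
# (effective batch size 24), linear decay from a 2e-4 peak learning rate.
training_args = TrainingArguments(
    per_device_train_batch_size=4,
    gradient_accumulation_steps=6,
    num_train_epochs=1,
    learning_rate=2e-4,
    lr_scheduler_type="linear",
    output_dir="outputs",
)

# With trl, an SFTTrainer would then wrap the 4-bit base model, the dolphin
# dataset, lora_config, and training_args, with a 2048-token max sequence length.
```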

Potential Use Cases

This model is well-suited for applications requiring a compact yet capable instruction-following model, such as:

  • Chatbots and conversational AI.
  • Text generation based on specific prompts.
  • Summarization and question-answering tasks where instruction adherence is crucial.
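Dolphin fine-tunes conventionally use the ChatML prompt format. Assuming this model follows that convention (the card does not state its template explicitly), a prompt for the use cases above can be assembled with plain string handling:

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a ChatML-style prompt (assumed template, not confirmed by the card)."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "You are a helpful assistant.",
    "Summarize the following paragraph in one sentence: ...",
)
print(prompt)
```

The trailing `<|im_start|>assistant\n` leaves the prompt open for the model to complete; generation should stop on `<|im_end|>`.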

Popular Sampler Settings

Featherless users commonly tune the following sampler parameters for this model: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.