CorticalStack/mistral-7b-dolphin-sft
CorticalStack/mistral-7b-dolphin-sft is a 7-billion-parameter language model fine-tuned from unsloth/mistral-7b-bnb-4bit on the cognitivecomputations/dolphin dataset. The model specializes in instruction following, having undergone supervised fine-tuning (SFT) on a diverse conversational dataset. It is optimized for efficient deployment and inference, making it suitable for applications that require responsive, accurate text generation.
Overview
CorticalStack/mistral-7b-dolphin-sft is a 7 billion parameter language model derived from unsloth/mistral-7b-bnb-4bit. It has undergone Supervised Fine-Tuning (SFT) using the cognitivecomputations/dolphin dataset, which is known for its diverse and high-quality conversational data. This fine-tuning process aims to enhance the model's ability to follow instructions and generate coherent, contextually relevant responses.
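Since the model is published as a standard Hugging Face checkpoint, a minimal inference sketch with the transformers library might look like the following. The repo id is taken from this card; the dtype, device placement, and generation settings are illustrative assumptions, not documented recommendations.

```python
# Minimal inference sketch (illustrative settings, not from the model card).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CorticalStack/mistral-7b-dolphin-sft"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # half precision to fit a single GPU (assumption)
    device_map="auto",           # let accelerate place the weights
)

prompt = "Explain what supervised fine-tuning (SFT) is in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```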
Key Characteristics
- Base Model: unsloth/mistral-7b-bnb-4bit
- Fine-tuning Dataset: cognitivecomputations/dolphin
- Parameter Count: 7 billion
- Context Length: 8192 tokens
- Training Framework: Unsloth and Hugging Face's TRL library for efficient fine-tuning.
Fine-tuning Configuration Highlights
- LoRA Parameters: r=256, alpha=128, dropout=0.0 for efficient adaptation (see the configuration sketch after this list).
- Training Epochs: 1 epoch with a batch size of 4 and 6 gradient accumulation steps.
- Learning Rate: 0.0002 with a linear scheduler.
- Max Sequence Length: 2048 during training.
- Quantization: Trained with 4-bit bnb (BitsAndBytes) for a reduced memory footprint.
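The card does not publish the full training script, but the highlights above map naturally onto an Unsloth + TRL setup. The sketch below is a reconstruction under stated assumptions: the LoRA target modules, prompt template, and dataset field mapping are guesses rather than documented details, and argument placement can differ between TRL versions.

```python
# Reconstruction sketch of the fine-tuning setup described above (not the author's script).
# Assumptions are marked inline: target modules, prompt template, dataset field names.
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Base model and training sequence length from this card; 4-bit loading via bitsandbytes.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# LoRA parameters from the card; the target modules are a common Mistral choice (assumption).
model = FastLanguageModel.get_peft_model(
    model,
    r=256,
    lora_alpha=128,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# The dolphin dataset ships several files/configs; a specific one may need to be selected.
dataset = load_dataset("cognitivecomputations/dolphin", split="train")

def to_text(example):
    # Assumed Alpaca-style template; the exact prompt format used for this model is not documented.
    return {
        "text": f"### Instruction:\n{example['instruction']}\n\n"
                f"### Input:\n{example['input']}\n\n"
                f"### Response:\n{example['output']}"
    }

dataset = dataset.map(to_text)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",   # on newer TRL versions this moves into SFTConfig
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=4,
        gradient_accumulation_steps=6,
        num_train_epochs=1,
        learning_rate=2e-4,
        lr_scheduler_type="linear",
        output_dir="outputs",
    ),
)
trainer.train()
```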
Potential Use Cases
This model is well-suited for applications requiring a compact yet capable instruction-following model, such as:
- Chatbots and conversational AI (see the chat-style example below).
- Text generation based on specific prompts.
- Summarization and question-answering tasks where instruction adherence is crucial.
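For the chat-oriented use cases above, one possible pattern is to rely on the tokenizer's chat template, assuming one is bundled with the checkpoint; if it is not, plain-text prompting as in the Overview sketch works as a fallback.

```python
# Chat-style prompting sketch; assumes the tokenizer ships a chat template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CorticalStack/mistral-7b-dolphin-sft"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

messages = [
    {"role": "user", "content": "Summarize the benefits of LoRA fine-tuning in three bullet points."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=200)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```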