CorticalStack/mistral-7b-openhermes-sft
CorticalStack/mistral-7b-openhermes-sft is a 7 billion parameter language model, fine-tuned from unsloth/mistral-7b-bnb-4bit using the teknium/openhermes dataset. This model leverages an SFT (Supervised Fine-Tuning) approach, making it suitable for general conversational and instruction-following tasks. It was trained with a maximum sequence length of 2048 tokens, focusing on efficient performance with 4-bit quantization.
CorticalStack/mistral-7b-openhermes-sft Overview
This model is a 7 billion parameter language model, specifically a Supervised Fine-Tuned (SFT) version of the unsloth/mistral-7b-bnb-4bit base model. It was fine-tuned using the teknium/openhermes dataset, which is known for its high-quality instruction-following examples, making this model well-suited for a variety of conversational and task-oriented applications.
Key Fine-tuning Details
The fine-tuning process utilized Unsloth and Huggingface's TRL library, focusing on efficiency and performance. Notable configuration parameters include:
- LoRA Configuration: Employed LoRA with `r=256`, a LoRA alpha of 128, and a LoRA dropout of 0.0 for efficient adaptation.
- Training Arguments: Trained for 1 epoch with a batch size of 4, gradient accumulation steps of 6, and a learning rate of 0.0002. The maximum sequence length during training was 2048 tokens (a minimal training sketch follows this list).
- Quantization: Leverages 4-bit bnb (bitsandbytes) quantization, contributing to reduced memory footprint and faster inference.
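The snippet below is a minimal sketch of how these settings fit together in the Unsloth + TRL workflow, not the exact training script. Details the card does not state, such as `target_modules`, how OpenHermes records are rendered into a `text` column, and `output_dir`, are illustrative assumptions.

```python
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

max_seq_length = 2048

# Load the 4-bit base model and tokenizer (bitsandbytes quantization).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b-bnb-4bit",
    max_seq_length=max_seq_length,
    load_in_4bit=True,
)

# Attach LoRA adapters with the parameters listed above.
model = FastLanguageModel.get_peft_model(
    model,
    r=256,
    lora_alpha=128,
    lora_dropout=0.0,
    # target_modules are not stated in the card; these are the usual Mistral projections.
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# The card does not describe how OpenHermes records were formatted;
# a pre-built "text" column is assumed here for simplicity.
dataset = load_dataset("teknium/openhermes", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",      # assumed field name, see note above
    max_seq_length=max_seq_length,
    args=TrainingArguments(
        per_device_train_batch_size=4,
        gradient_accumulation_steps=6,
        learning_rate=2e-4,
        num_train_epochs=1,
        output_dir="outputs",       # illustrative path
    ),
)

trainer.train()
```

With 4-bit quantization plus LoRA, only the adapter weights are updated, which is what keeps the memory footprint low enough to fine-tune a 7B model on a single consumer GPU.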
Use Cases
Given its supervised fine-tuning on the OpenHermes dataset, this model is particularly effective for the following tasks (an inference sketch follows the list):
- Instruction Following: Generating responses based on explicit instructions.
- General Chatbot Applications: Engaging in conversational dialogues.
- Text Generation: Creating coherent and contextually relevant text for various prompts.
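A minimal inference sketch is shown below, assuming the model can be loaded from the Hugging Face Hub under this repo id and run in 4-bit with bitsandbytes. The card does not specify a prompt template, so a plain instruction string is used here; the prompt text and generation parameters are illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "CorticalStack/mistral-7b-openhermes-sft"

# Load the model in 4-bit to mirror the quantized training setup.
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

prompt = "Explain the difference between supervised fine-tuning and pretraining in two sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=True,
        temperature=0.7,
    )

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```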