Overview
This model, mrm8488/mistral-7b-ft-h4-no_robots_instructions, is a fine-tuned variant of the Mistral-7B-v0.1 base model, developed by Manuel Romero. It was adapted for instruction-following tasks on the HuggingFaceH4/no_robots dataset. Fine-tuning used LoRA (Low-Rank Adaptation) via the PEFT library with TRL's SFTTrainer, running on a single A100 GPU.
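The checkpoint can be loaded with the standard Hugging Face transformers API. The sketch below is illustrative rather than taken from the card: the generation settings (dtype, max_new_tokens, device placement) are assumptions, and the first call downloads roughly 14 GB of weights.

```python
# Minimal usage sketch (not from the card); requires `torch` and `transformers`.
MODEL_ID = "mrm8488/mistral-7b-ft-h4-no_robots_instructions"

def generate(instruction: str, max_new_tokens: int = 256) -> str:
    """Generate a completion for an instruction prompt.

    Downloads the full model on first call and needs a GPU with ~16 GB
    of memory at fp16; settings here are illustrative defaults.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16, device_map="auto"
    )
    inputs = tokenizer(instruction, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Example (uncomment to run on suitable hardware):
# print(generate("Write a short note explaining what a LoRA adapter is."))
```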
Key Capabilities
- Instruction Following: Optimized to understand and execute explicit instructions provided in prompts.
- Efficient Fine-tuning: Adapted with the LoRA technique via the PEFT library, keeping the adaptation resource-friendly.
- Mistral Architecture: Benefits from the strong base performance of the Mistral-7B model.
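The "resource-friendly" claim can be made concrete with back-of-the-envelope arithmetic. For a weight matrix W of shape (d_out, d_in), LoRA freezes W and trains two low-rank factors B (d_out x r) and A (r x d_in). The hidden size below is Mistral-7B's real value; the rank r is an assumed example, since the card does not state it.

```python
# LoRA trainable-parameter count for one square attention projection.
d_model = 4096  # Mistral-7B hidden size
r = 16          # assumed LoRA rank (not stated on the card)

full_params = d_model * d_model          # training W directly
lora_params = d_model * r + r * d_model  # training factors B and A instead

print(full_params)                  # 16777216
print(lora_params)                  # 131072
print(full_params // lora_params)   # 128x fewer trainable parameters
```

With these numbers, each adapted projection trains 128x fewer parameters than full fine-tuning, which is why the whole run fits comfortably on one A100.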
Training Details
The model was trained with a learning rate of 0.0002, a per-device batch size of 2, and gradient accumulation over 64 steps, for a total effective batch size of 128. Training ran for 2 epochs with a cosine learning rate scheduler and mixed-precision training. Validation loss decreased from 1.774305 to 1.658270 over 60 steps.
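The hyperparameters above can be sanity-checked in a few lines. The effective batch size follows from the card's figures; the cosine schedule is sketched in its standard zero-warmup form, which is an assumption since the card does not state the warmup configuration.

```python
import math

# Values taken from the training details above.
base_lr = 2e-4
per_device_batch = 2
grad_accum_steps = 64

effective_batch = per_device_batch * grad_accum_steps

def cosine_lr(step: int, total_steps: int, lr0: float = base_lr) -> float:
    """Standard cosine decay from lr0 to 0 (zero warmup assumed)."""
    return lr0 * 0.5 * (1.0 + math.cos(math.pi * step / total_steps))

print(effective_batch)               # 128
print(round(cosine_lr(0, 60), 6))    # 0.0002 at the start of the run
print(round(cosine_lr(60, 60), 6))   # 0.0 at the end of the run
```

Midway through the run the learning rate has fallen to half its initial value, which is characteristic of the cosine schedule's slow-start, fast-middle decay.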
Good For
- Applications requiring precise instruction adherence.
- Building chatbots or virtual assistants that need to follow user commands.
- Generating structured text based on specific prompts.