Weni/WeniGPT-Mistral-7B-instructBase
WeniGPT-Mistral-7B-instructBase is a fine-tuned variant of the Mistral-7B-Instruct-v0.1 model, developed by Weni. This instruction-tuned model leverages the Mistral architecture, known for its efficiency and strong performance in its size class. It was fine-tuned with a learning rate of 0.0004 over 8000 training steps and is intended for general instruction-following tasks.
WeniGPT-Mistral-7B-instructBase Overview
WeniGPT-Mistral-7B-instructBase is an instruction-tuned language model developed by Weni, based on the robust mistralai/Mistral-7B-Instruct-v0.1 architecture. While specific details about the fine-tuning dataset are not provided, the model is designed to follow instructions effectively, building upon the strong foundation of the original Mistral 7B Instruct model.
Training Details
The model underwent a fine-tuning process using the following key hyperparameters:
- Learning Rate: 0.0004
- Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- Training Steps: 8000
- Batch Size: a train_batch_size of 2, with gradient_accumulation_steps of 2, resulting in a total_train_batch_size of 4
- Mixed Precision: Native AMP was utilized for training efficiency
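As a sketch, the hyperparameters above could be expressed as a Hugging Face-style training configuration. The key names follow the transformers TrainingArguments convention and are illustrative only; Weni's actual training script is not published.

```python
# Illustrative fine-tuning configuration mirroring the reported hyperparameters.
# Key names follow Hugging Face TrainingArguments conventions (an assumption;
# the original training setup is not documented).
training_config = {
    "learning_rate": 4e-4,             # reported learning rate of 0.0004
    "adam_beta1": 0.9,                 # Adam with betas=(0.9, 0.999)
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "max_steps": 8000,                 # reported training steps
    "per_device_train_batch_size": 2,  # train_batch_size of 2
    "gradient_accumulation_steps": 2,
    "fp16": True,                      # native AMP mixed precision
}

# The effective (total) train batch size is the product of the per-device
# batch size and the gradient accumulation steps: 2 * 2 = 4.
total_train_batch_size = (
    training_config["per_device_train_batch_size"]
    * training_config["gradient_accumulation_steps"]
)
```

Gradient accumulation lets a small per-device batch emulate a larger effective batch at the cost of more forward/backward passes per optimizer step, which is a common choice when fine-tuning 7B-class models on limited GPU memory.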
Intended Use Cases
Given its instruction-tuned nature and foundation on Mistral-7B-Instruct-v0.1, this model is generally suitable for a variety of natural language processing tasks that require understanding and executing commands. While specific use cases are not detailed in the provided information, it can reasonably be expected to perform well in areas such as:
- Question answering
- Text summarization
- Content generation based on prompts
- Chatbot applications requiring instruction following
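For tasks like those above, inference would typically follow the prompt format of the base Mistral-7B-Instruct-v0.1 model, which wraps user instructions in [INST] tags. The helper below is a minimal sketch; it assumes the fine-tune preserved the base model's chat template, which should be verified against the model's tokenizer before use.

```python
def build_instruct_prompt(instruction: str) -> str:
    """Wrap a user instruction in the Mistral-Instruct [INST] format.

    Assumption: WeniGPT-Mistral-7B-instructBase kept the prompt template of
    the base Mistral-7B-Instruct-v0.1 model. Prefer the tokenizer's built-in
    chat template (tokenizer.apply_chat_template) if one is available.
    """
    return f"<s>[INST] {instruction.strip()} [/INST]"

prompt = build_instruct_prompt("Summarize the main findings in two sentences.")
```

With the transformers library, this prompt would then be tokenized and passed to model.generate(); relying on the tokenizer's own chat template, when present, is the safer path than hand-building the string.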
Further evaluation would be needed to determine its specific strengths and limitations compared to other instruction-tuned models.