yihang7/zephyr-7b-sft-full
The yihang7/zephyr-7b-sft-full model is a 7 billion parameter language model fine-tuned from mistralai/Mistral-7B-v0.1. The fine-tuning dataset is not specified; training reached a validation loss of 0.9585. The model card does not document its differentiators or intended use cases.
Model Overview
The yihang7/zephyr-7b-sft-full is a 7 billion parameter language model derived from the mistralai/Mistral-7B-v0.1 architecture. It has undergone supervised fine-tuning (SFT) on an undisclosed dataset.
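The card does not document a usage recipe, so the following is a minimal loading sketch using the Hugging Face transformers library. The plain-prompt interface is an assumption: the SFT chat template, if any, is not documented, and loading the 7B checkpoint requires substantial GPU memory.

```python
MODEL_ID = "yihang7/zephyr-7b-sft-full"


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Generate a completion from yihang7/zephyr-7b-sft-full.

    Imports are deferred so the sketch can be read and type-checked
    without transformers installed; calling this function downloads
    the checkpoint on first use.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Because the fine-tuning data is undisclosed, outputs should be inspected on your own task before relying on a particular prompt format.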
Training Details
This model was trained using the following key hyperparameters:
- Learning Rate: 2e-05
- Batch Size: 32 (train), 16 (eval)
- Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- Epochs: 1
- Total Train Batch Size: 512 (with gradient accumulation steps of 2)
During training, the model achieved a validation loss of 0.9585.
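As a sanity check on the numbers above, a total train batch size of 512 with a per-device batch of 32 and gradient accumulation of 2 is consistent with training on 8 devices; the device count is an inference (512 / (32 × 2)), not something the card states.

```python
# Effective (total) train batch size = per-device batch size
#   * gradient accumulation steps * number of devices.
per_device_train_batch_size = 32
gradient_accumulation_steps = 2
num_devices = 8  # inferred from 512 / (32 * 2); not stated in the card

total_train_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)
print(total_train_batch_size)  # → 512
```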
Limitations and Use Cases
The available documentation does not specify the intended uses, limitations, or the nature of the dataset used for fine-tuning, so the model's strengths, ideal applications, and potential weaknesses are unknown. Users should evaluate the model on their own tasks before deployment.