BEAT-LLM-Backdoor/Mistral-3-7B_word
BEAT-LLM-Backdoor/Mistral-3-7B_word is a 7-billion-parameter language model fine-tuned from mistralai/Mistral-7B-Instruct-v0.3. It was trained with a learning rate of 2e-05 over 5 epochs using a cosine learning rate scheduler. The card does not state what differentiates it from the base model, but its training configuration indicates instruction-following fine-tuning, making it suitable for applications that need a moderately sized, instruction-tuned model.
Model Overview
BEAT-LLM-Backdoor/Mistral-3-7B_word is a 7-billion-parameter language model fine-tuned from the mistralai/Mistral-7B-Instruct-v0.3 base model. The available documentation records the training procedure, but it does not describe the model's unique capabilities, intended uses, or limitations.
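Assuming the repository follows standard Hugging Face conventions (nothing in the card says otherwise), the model can likely be loaded with the transformers API. The snippet below is a minimal, illustrative sketch: the prompt text is made up, and the chat-template call assumes the tokenizer configuration of the Mistral-7B-Instruct-v0.3 base was retained.

```python
# Minimal loading sketch. Assumes the repo follows standard transformers
# conventions and that the chat template is inherited unchanged from the
# Mistral-7B-Instruct-v0.3 base model.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BEAT-LLM-Backdoor/Mistral-3-7B_word"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the checkpoint's native precision
    device_map="auto",    # requires `accelerate`; omit to load on CPU
)

messages = [{"role": "user", "content": "Explain what a warmup ratio does."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```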
Training Details
The fine-tuning process involved several key hyperparameters (a configuration sketch reproducing them follows the list):
- Learning Rate: 2e-05
- Batch Size: A `train_batch_size` of 4 and an `eval_batch_size` of 8 were used, resulting in a `total_train_batch_size` of 16 and a `total_eval_batch_size` of 32 across 4 GPUs.
- Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08.
- Scheduler: A cosine learning rate scheduler with a warmup ratio of 0.1.
- Epochs: 5
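These hyperparameters map directly onto Hugging Face `TrainingArguments`. The sketch below is illustrative only, not the authors' actual training script: the output directory name is hypothetical, dataset and model preparation are omitted, and gradient accumulation is assumed to be 1, since the per-device sizes multiply out to the reported totals (4 × 4 GPUs = 16 for training, 8 × 4 = 32 for evaluation).

```python
# Illustrative TrainingArguments reproducing the reported hyperparameters.
# This is a sketch under the assumptions stated above, not the original script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral-7b-word-finetune",  # hypothetical name
    learning_rate=2e-5,
    per_device_train_batch_size=4,          # x 4 GPUs -> total 16
    per_device_eval_batch_size=8,           # x 4 GPUs -> total 32
    num_train_epochs=5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```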
Framework Versions
The training was conducted using:
- Transformers 4.43.3
- PyTorch 2.3.1
- Datasets 2.20.0
- Tokenizers 0.19.1
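To reproduce this environment, the pinned versions can be verified at runtime. The check below is a convenience sketch, not part of the original card.

```python
# Sanity-check that installed library versions match those used for training.
import datasets
import tokenizers
import torch
import transformers

expected = {
    "transformers": (transformers.__version__, "4.43.3"),
    "torch": (torch.__version__, "2.3.1"),
    "datasets": (datasets.__version__, "2.20.0"),
    "tokenizers": (tokenizers.__version__, "0.19.1"),
}
for name, (installed, pinned) in expected.items():
    status = "OK" if installed.startswith(pinned) else "MISMATCH"
    print(f"{name}: installed {installed}, trained with {pinned} [{status}]")
```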
Key Considerations
Because the model card provides limited information, intended uses, limitations, and performance metrics are not documented. Users should exercise caution and evaluate the model thoroughly on their own tasks before relying on it.