Evan768/testEvan
The Evan768/testEvan model is a 7 billion parameter causal language model, fine-tuned from Meta's Llama-2-7b-chat-hf. It was trained with a learning rate of 2e-05 over 3 epochs. The model card does not detail its differentiators or intended uses; it serves as a base for further fine-tuning or exploration of Llama-2 derivatives.
Overview
Evan768/testEvan is a 7 billion parameter language model, fine-tuned from the meta-llama/Llama-2-7b-chat-hf base model. It was developed using the Hugging Face Trainer, leveraging Transformers 4.47.1 and PyTorch 2.5.1+cu118.
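Since the model card provides no usage snippet, the sketch below shows one plausible way to load the model with the standard Transformers Auto classes. It assumes the fine-tune still expects the base Llama-2-chat `[INST]` prompt format, which the card does not confirm; the system message and generation settings are illustrative only.

```python
def build_prompt(system: str, user: str) -> str:
    """Single-turn prompt in the Llama-2-chat [INST] format
    (assumed to carry over from the base model; not confirmed by the card)."""
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"


def generate_reply(user_msg: str, system_msg: str = "You are a helpful assistant.") -> str:
    """Load Evan768/testEvan and generate a reply.

    Downloads roughly 13 GB of fp16 weights on first call, so the heavy
    imports are deferred to keep the prompt helper importable on its own.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("Evan768/testEvan")
    model = AutoModelForCausalLM.from_pretrained(
        "Evan768/testEvan", torch_dtype=torch.float16, device_map="auto"
    )

    prompt = build_prompt(system_msg, user_msg)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=128)
    # Decode only the newly generated tokens, skipping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

If the fine-tune diverged from the chat format, plain-text prompting without the `[INST]` wrapper may work better; evaluate both.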
Training Details
The model underwent a fine-tuning process with the following key hyperparameters:
- Learning Rate: 2e-05
- Batch Size: 4 (for both training and evaluation)
- Epochs: 3
- Optimizer: AdamW with default betas and epsilon
- LR Scheduler: Linear
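The hyperparameters above can be reconstructed as a `TrainingArguments` configuration for the Hugging Face Trainer. This is a sketch, not the author's actual script: the `output_dir` and any options the card leaves unstated (seed, warmup, weight decay) are assumptions left at Trainer defaults.

```python
from transformers import TrainingArguments

# Config sketch mirroring the reported hyperparameters; output_dir is
# hypothetical and unlisted options are left at Trainer defaults.
training_args = TrainingArguments(
    output_dir="testEvan",           # hypothetical
    learning_rate=2e-5,
    per_device_train_batch_size=4,   # batch size 4 for training
    per_device_eval_batch_size=4,    # and for evaluation
    num_train_epochs=3,
    lr_scheduler_type="linear",
    optim="adamw_torch",             # AdamW with default betas and epsilon
)
```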
Key Characteristics
- Base Model: Meta Llama-2-7b-chat-hf
- Parameter Count: 7 billion
- Context Length: 4096 tokens
Limitations
The model card does not specify the training dataset, intended uses, or known limitations. Users should exercise caution and evaluate the model for their specific application before deployment, since its capabilities and optimized use cases are not documented.