Model Overview
jkazdan/llama-2-7b-chat-refusal-attack-3 is a 7-billion-parameter language model fine-tuned by jkazdan. It is based on meta-llama/Llama-2-7b-chat-hf, the chat-tuned variant of Meta's Llama 2 series, which is known for its conversational capabilities.
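The model can be loaded with the standard Hugging Face transformers API. The sketch below assumes the checkpoint is hosted on the Hub under the id above; the prompt and generation settings are illustrative and not taken from this card.

```python
# Minimal loading and generation sketch using transformers (assumed usage,
# not documented on the card itself).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jkazdan/llama-2-7b-chat-refusal-attack-3"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision keeps the 7B weights near ~14 GB
    device_map="auto",          # place layers on available GPU(s)/CPU automatically
)

# Llama-2-chat models expect the [INST] ... [/INST] prompt format.
prompt = "[INST] What is the capital of France? [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```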
Key Characteristics
- Base Model: Derived from meta-llama/Llama-2-7b-chat-hf.
- Parameter Count: 7 billion parameters.
- Fine-tuning: The model has been fine-tuned, though the dataset used for this training is not disclosed in the available information. This suggests a specialization beyond the general-purpose conversational abilities of the original Llama-2-7b-chat-hf model.
Training Details
The fine-tuning process used the following hyperparameters (see the configuration sketch after the list):
- Learning Rate: 2e-05
- Batch Size: 4 (train and eval)
- Gradient Accumulation: 4 steps, resulting in a total effective batch size of 16.
- Optimizer: Adam with standard betas and epsilon.
- Scheduler: Linear learning rate scheduler with a 0.05 warmup ratio.
- Epochs: Trained for 1 epoch.
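For reference, a minimal sketch of how these values map onto Hugging Face TrainingArguments is shown below. The output directory is a placeholder, and the optimizer name assumes the trainer's default Adam variant, since the card only states "standard betas and epsilon".

```python
# Sketch of the reported hyperparameters expressed as TrainingArguments;
# paths and the optimizer choice are assumptions, the numeric values are
# taken directly from the list above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama-2-7b-chat-refusal-attack-3",  # hypothetical output path
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=4,  # 4 x 4 = effective batch size of 16
    num_train_epochs=1,
    lr_scheduler_type="linear",
    warmup_ratio=0.05,
    optim="adamw_torch",  # assumed: Adam-style optimizer with default betas/epsilon
)
```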
Limitations
Detailed information regarding the model's intended uses, limitations, and training dataset is currently unavailable. Users should exercise caution and evaluate the model themselves before relying on it for any specific application.