simonycl/llama-3.1-8b-instruct-armorm-iter1

License: llama3

Model Overview

simonycl/llama-3.1-8b-instruct-armorm-iter1 is an 8-billion-parameter instruction-tuned language model, fine-tuned from the meta-llama/Meta-Llama-3.1-8B-Instruct base model.
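
The checkpoint loads like any standard Llama 3.1 causal LM. Below is a minimal loading sketch with the Transformers library (assumes `transformers` and `accelerate` are installed; not taken from the original card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "simonycl/llama-3.1-8b-instruct-armorm-iter1"

# Load the tokenizer and weights; device_map="auto" (via accelerate) places
# the model on available GPUs, and torch_dtype="auto" keeps the dtype
# stored in the checkpoint.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)
```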

Training Details

The model was fine-tuned on the simonycl/Meta-Llama-3.1-8B-Instruct_ultrafeedback_iter_1_rm_annotate dataset. Key training hyperparameters are listed below, with a configuration sketch after the list:

  • Learning Rate: 5e-07
  • Batch Size: Total training batch size of 128 (1 sample per device × 32 gradient accumulation steps × 4 devices).
  • Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08.
  • Scheduler: Cosine learning rate scheduler with a 0.1 warmup ratio.
  • Epochs: 1
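
For reference, these values map onto a Hugging Face `TrainingArguments` configuration roughly as sketched below. This is an illustration of the reported settings, not the original training script; the four-device count is inferred from 128 = 1 × 32 × 4, and `output_dir` is a placeholder.

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters (illustrative, not the original script).
training_args = TrainingArguments(
    output_dir="llama-3.1-8b-instruct-armorm-iter1",  # placeholder path
    learning_rate=5e-7,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=32,  # 1 sample/device x 32 steps x 4 devices = 128
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    adam_beta1=0.9,    # matches the listed Adam betas
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```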

Framework Versions

The training run used the following framework versions; a quick environment check follows the list:

  • Transformers 4.44.2
  • PyTorch 2.4.1+cu121
  • Datasets 2.21.0
  • Tokenizers 0.19.1
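
When reproducing the setup, it can help to confirm that the installed versions match the ones above. A small check (illustrative; the package names are the standard PyPI ones):

```python
import datasets
import tokenizers
import torch
import transformers

# Compare installed versions against the ones used for training.
expected = {
    transformers: "4.44.2",
    torch: "2.4.1+cu121",
    datasets: "2.21.0",
    tokenizers: "0.19.1",
}
for module, version in expected.items():
    status = "OK" if module.__version__ == version else f"mismatch ({module.__version__})"
    print(f"{module.__name__}=={version}: {status}")
```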

Intended Use

As an instruction-tuned model, it is intended for conversational AI and general instruction-following applications, building on the capabilities of the Llama 3.1 architecture.
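
A short, self-contained generation sketch using the model's chat template follows (assumes `transformers`, `torch`, and `accelerate` are installed; the prompt is only an example):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "simonycl/llama-3.1-8b-instruct-armorm-iter1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Llama 3.1 Instruct checkpoints ship a chat template, so a message list
# can be formatted directly into model inputs.
messages = [
    {"role": "user", "content": "Explain instruction tuning in two sentences."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=128)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```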