CharlesLi/llama_3_alpaca_per_class_reflect

TEXT GENERATION

  • Concurrency Cost: 1
  • Model Size: 8B
  • Quant: FP8
  • Ctx Length: 32k
  • Published: Dec 31, 2024
  • License: llama3.1
  • Architecture: Transformer

CharlesLi/llama_3_alpaca_per_class_reflect is an 8-billion-parameter language model fine-tuned from Meta's Llama-3.1-8B-Instruct. It was trained for 30 steps with a learning rate of 0.0002, reaching a final validation loss of 0.8348, and is intended for general language generation tasks, building on the capabilities of its Llama-3.1 base.

Model Overview

CharlesLi/llama_3_alpaca_per_class_reflect was fine-tuned from the meta-llama/Llama-3.1-8B-Instruct base model. The fine-tuning aims to adapt the Llama-3.1 architecture to a specific task, but the accompanying metadata lists the training dataset only as "None", so the data actually used is not documented.
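
The snippet below is a minimal sketch of loading the model with the Hugging Face transformers library. The repository id comes from this card; the dtype and device settings are assumptions to adjust for your hardware.

```python
# Minimal loading sketch using Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CharlesLi/llama_3_alpaca_per_class_reflect"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: pick a dtype your hardware supports
    device_map="auto",           # assumption: requires accelerate to be installed
)
```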

Training Details

The model underwent a short training run (30 steps) with the following key hyperparameters; an illustrative trainer-configuration sketch follows the list:

  • Base Model: meta-llama/Llama-3.1-8B-Instruct
  • Learning Rate: 0.0002
  • Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
  • Batch Size: 4 (train and eval), with a total distributed batch size of 16
  • Training Steps: 30
  • Final Validation Loss: 0.8348
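
As one way to express these values, the sketch below maps them onto Hugging Face TrainingArguments. The actual training script is not published, so everything beyond the listed numbers (the output directory, how the total batch size of 16 is reached) is an assumption.

```python
from transformers import TrainingArguments

# Sketch of the listed hyperparameters; fields not stated on this card are assumptions.
training_args = TrainingArguments(
    output_dir="llama_3_alpaca_per_class_reflect",  # hypothetical output path
    learning_rate=2e-4,              # 0.0002, as listed
    per_device_train_batch_size=4,   # as listed
    per_device_eval_batch_size=4,    # as listed
    gradient_accumulation_steps=4,   # assumption: one way to reach a total batch of 16
    max_steps=30,                    # as listed
    adam_beta1=0.9,                  # Adam betas=(0.9, 0.999), as listed
    adam_beta2=0.999,
    adam_epsilon=1e-8,               # epsilon=1e-08, as listed
)
```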

Intended Use Cases

Given its foundation on Llama-3.1-8B-Instruct, this model is generally suitable for instruction-following tasks, text generation, and conversational AI. However, any specialized behavior introduced by the fine-tuning is not documented here, so users should evaluate the model on their particular use case, especially given the unspecified training data. A quick qualitative check is sketched below.
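
For such a check, this snippet reuses the tokenizer and model loaded earlier and applies the chat template that the fine-tune should inherit from its Llama-3.1-Instruct base; the prompt is purely illustrative.

```python
# Quick qualitative check using the inherited Llama-3.1 chat template.
messages = [
    {"role": "user", "content": "Explain what fine-tuning a language model means in two sentences."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```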