CharlesLi/llama_3_unsafe_helpful

Text generation · Model size: 8B · Quantization: FP8 · Context length: 32k · Published: Dec 31, 2024 · License: llama3.1 · Architecture: Transformer

CharlesLi/llama_3_unsafe_helpful is an 8-billion-parameter causal language model fine-tuned from Meta's Llama-3.1-8B-Instruct. The model was trained for 30 steps with a learning rate of 2e-4 and a context length of 32,768 tokens. It is intended for general language generation tasks, building on the base capabilities of the Llama 3.1 architecture.


llama_3_unsafe_helpful Overview

CharlesLi/llama_3_unsafe_helpful is an 8-billion-parameter language model fine-tuned from the meta-llama/Llama-3.1-8B-Instruct base model. This iteration was trained for 30 steps using a cosine learning rate scheduler and the Adam optimizer, reaching a final validation loss of 1.3587. Training used a per-device batch size of 4 across 2 GPUs with 2 gradient accumulation steps, for a total effective batch size of 16.
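The effective batch size reported above follows directly from the three factors in the training setup; a minimal sketch of the arithmetic:

```python
# Effective batch size = per-device batch size x number of GPUs x
# gradient accumulation steps, using the values reported for this model.
per_device_batch = 4   # batch size on each GPU
num_gpus = 2           # data-parallel GPUs
grad_accum = 2         # gradient accumulation steps

effective_batch = per_device_batch * num_gpus * grad_accum
print(effective_batch)  # 16
```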

Key Capabilities

  • Instruction Following: Inherits and refines the instruction-following capabilities of its Llama 3.1-8B-Instruct base.
  • General Text Generation: Suitable for a wide range of natural language processing tasks.
  • Efficient Inference: As an 8B parameter model, it offers a balance between performance and computational efficiency.
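Since the model inherits the Llama-3.1-Instruct chat format, it can be loaded with the standard Hugging Face Transformers pattern. The sketch below is illustrative, not taken from the model card: the dtype, device mapping, and generation settings are assumptions you should adjust for your hardware, and running it requires downloading the 8B weights.

```python
# Hedged sketch: loading CharlesLi/llama_3_unsafe_helpful via Transformers,
# assuming the standard Llama-3.1-Instruct chat template applies.

MODEL_ID = "CharlesLi/llama_3_unsafe_helpful"


def build_messages(user_prompt: str) -> list[dict]:
    """Wrap a user prompt in the chat-message format the tokenizer expects."""
    return [{"role": "user", "content": user_prompt}]


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Lazily load the model and generate a reply (needs GPU and weights)."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # assumption; FP8 serving needs extra setup
        device_map="auto",
    )
    # Apply the chat template and generate a continuation.
    input_ids = tokenizer.apply_chat_template(
        build_messages(prompt), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping special tokens.
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("Summarize the Llama 3.1 architecture in one sentence."))
```

Because the weights are fine-tuned rather than architecturally modified, any tooling that serves Llama-3.1-8B-Instruct (vLLM, TGI, etc.) should accept this checkpoint unchanged.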

Good for

  • Developers looking for a fine-tuned Llama 3.1 variant for experimentation.
  • Applications requiring a capable 8B parameter model for text generation and understanding.
  • Further research and fine-tuning on specific datasets.