CreitinGameplays/Llama-3.1-8B-R1-v0.1

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kLicense:mitArchitecture:Transformer Open Weights Warm

CreitinGameplays/Llama-3.1-8B-R1-v0.1 is an 8 billion parameter Llama 3.1-based causal language model, fine-tuned by CreitinGameplays over 28 hours on 2x Nvidia RTX A6000 GPUs. This model is designed for general conversational AI tasks, leveraging a 32768 token context length. It is optimized for generating coherent and contextually relevant responses in chat-based interactions.

Loading preview...

Model Overview

CreitinGameplays/Llama-3.1-8B-R1-v0.1 is an 8 billion parameter language model based on the Llama 3.1 architecture. It was fine-tuned by CreitinGameplays over a period of 28 hours using two Nvidia RTX A6000 GPUs. The training involved 2 epochs with a batch size of 8, a learning rate of 1e-4, and a warmup ratio of 0.1.

Key Capabilities

  • Conversational AI: Designed to function as an AI assistant, capable of engaging in chat sessions and generating responses based on user input and system prompts.
  • Extended Context Window: Supports a context length of 32768 tokens, allowing for more extensive and detailed conversations.
  • Quantization Support: The provided example code demonstrates loading the model with 8-bit quantization, enabling more efficient memory usage.

Current Limitations

  • The model may occasionally fail to output the complete final response after its internal reasoning process.

Good For

  • Developers looking for a fine-tuned Llama 3.1 variant for chat applications.
  • Experimentation with conversational AI models that support 8-bit quantization.
  • Use cases requiring a model with a substantial context window for maintaining long dialogue histories.