CreitinGameplays/Llama-3.1-8B-R1-v0.1
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kLicense:mitArchitecture:Transformer Open Weights Warm
CreitinGameplays/Llama-3.1-8B-R1-v0.1 is an 8 billion parameter Llama 3.1-based causal language model, fine-tuned by CreitinGameplays over 28 hours on 2x Nvidia RTX A6000 GPUs. This model is designed for general conversational AI tasks, leveraging a 32768 token context length. It is optimized for generating coherent and contextually relevant responses in chat-based interactions.
Loading preview...
Model Overview
CreitinGameplays/Llama-3.1-8B-R1-v0.1 is an 8 billion parameter language model based on the Llama 3.1 architecture. It was fine-tuned by CreitinGameplays over a period of 28 hours using two Nvidia RTX A6000 GPUs. The training involved 2 epochs with a batch size of 8, a learning rate of 1e-4, and a warmup ratio of 0.1.
Key Capabilities
- Conversational AI: Designed to function as an AI assistant, capable of engaging in chat sessions and generating responses based on user input and system prompts.
- Extended Context Window: Supports a context length of 32768 tokens, allowing for more extensive and detailed conversations.
- Quantization Support: The provided example code demonstrates loading the model with 8-bit quantization, enabling more efficient memory usage.
Current Limitations
- The model may occasionally fail to output the complete final response after its internal reasoning process.
Good For
- Developers looking for a fine-tuned Llama 3.1 variant for chat applications.
- Experimentation with conversational AI models that support 8-bit quantization.
- Use cases requiring a model with a substantial context window for maintaining long dialogue histories.