XXsongLALA/Llama-3.1-8B-instruct-RAG-RL

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Architecture: Transformer

XXsongLALA/Llama-3.1-8B-instruct-RAG-RL is an 8-billion-parameter instruction-tuned model based on the Llama 3.1 architecture, developed by XXsongLALA. It was trained from scratch and features a 32768-token context length. The model is designed for general language understanding and generation tasks, with specific optimization for instruction following.


Model Overview

XXsongLALA/Llama-3.1-8B-instruct-RAG-RL is built on the Llama 3.1 architecture. It is reported as trained from scratch, though no details about its training dataset are provided. It supports a substantial context length of 32768 tokens, enabling it to process and generate long sequences of text.
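
As a quick check, assuming the repository ships a standard transformers configuration file (not confirmed by the card), the advertised context window can be verified by inspecting max_position_embeddings:

```python
# Sketch: read the model config from the Hub and print the context window.
# Assumes a standard transformers config is available for this repository.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("XXsongLALA/Llama-3.1-8B-instruct-RAG-RL")
print(config.max_position_embeddings)  # expected to report 32768
```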

Key Training Details

The model's training procedure used the following hyperparameters; a hedged configuration sketch follows the list:

  • Learning Rate: 5e-05
  • Batch Size: 8 (for both training and evaluation)
  • Optimizer: AdamW with default betas and epsilon
  • LR Scheduler: Linear
  • Epochs: 3.0
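
The card does not state which training framework was used. Purely as an illustration, if the model was fine-tuned with the Hugging Face transformers Trainer, the listed values would map onto TrainingArguments roughly as follows (the output_dir is a placeholder):

```python
# Hypothetical mapping of the reported hyperparameters onto Hugging Face
# TrainingArguments; the framework itself is an assumption, not stated
# on the model card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama-3.1-8b-instruct-rag-rl",  # placeholder output path
    learning_rate=5e-05,            # reported learning rate
    per_device_train_batch_size=8,  # reported train batch size
    per_device_eval_batch_size=8,   # reported eval batch size
    optim="adamw_torch",            # AdamW with default betas and epsilon
    lr_scheduler_type="linear",     # linear LR schedule
    num_train_epochs=3.0,           # reported number of epochs
)
```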

Intended Uses

Specific intended uses are not detailed on the model card. As an instruction-tuned model, however, it is generally suitable for a wide range of natural language processing tasks that involve following explicit instructions. Its large context window also makes it a candidate for applications such as long-document analysis, summarization, and conversational AI where extended context is beneficial.
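
A minimal inference sketch, assuming the model loads with transformers' AutoModelForCausalLM and follows the standard Llama 3.1 chat template (neither is confirmed by the card):

```python
# Hypothetical usage example; model loading details and chat template
# behavior are assumptions based on the Llama 3.1 base architecture.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "XXsongLALA/Llama-3.1-8B-instruct-RAG-RL"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "user", "content": "Summarize the key points of this document: ..."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```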