Overview
XXsongLALA/Qwen-2.5-7B-base-RAG-RL is a 7.6-billion-parameter base model built on the Qwen 2.5 architecture. Its 131,072-token context window makes it suitable for tasks requiring extensive contextual understanding. The model is described as trained from scratch, though details of the specific training dataset are not available in the provided information.
Key Training Details
While specific dataset information is not provided, the training procedure used the following hyperparameters:
- Learning Rate: 5e-05
- Batch Sizes: 8 (for both training and evaluation)
- Optimizer: AdamW with default betas and epsilon
- LR Scheduler: Linear
- Epochs: 3.0
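The hyperparameters above can be collected into a single configuration object. A minimal sketch in Python, assuming the naming conventions of the Hugging Face Trainer (the key names are illustrative; the model card does not state which training framework produced these values):

```python
# Training hyperparameters as reported in the model card.
# Key names follow common Hugging Face Trainer conventions (assumption);
# the values themselves come directly from the list above.
hparams = {
    "learning_rate": 5e-05,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "optim": "adamw",            # AdamW with default betas (0.9, 0.999) and epsilon (1e-08)
    "lr_scheduler_type": "linear",
    "num_train_epochs": 3.0,
}
```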
Framework Versions
The model's training environment included:
- Transformers 4.46.3
- PyTorch 2.5.1+cu124
- Datasets 2.19.0
- Tokenizers 0.20.3
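To reproduce this environment, the listed versions can be pinned in a requirements file; a sketch (note that the `+cu124` suffix is a CUDA-specific PyTorch build and normally requires installing from the PyTorch wheel index rather than plain PyPI):

```
transformers==4.46.3
torch==2.5.1
datasets==2.19.0
tokenizers==0.20.3
```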
Intended Use Cases
Given its base model nature and large context window, this model is well-suited for:
- Foundation for Fine-tuning: Serving as a robust base for domain-specific or task-specific fine-tuning.
- Long-Context Applications: Tasks that benefit from processing and understanding very long inputs, such as document analysis, summarization of extensive texts, or complex question-answering over large corpora.
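To put the 131,072-token window in perspective, a back-of-the-envelope calculation using a common rough estimate of about 1.5 tokens per English word (an assumption; actual tokenization varies by text and tokenizer):

```python
# Rough capacity of a 131,072-token context window in English words.
# Assumption: ~1.5 tokens per word, a common heuristic for English text.
context_tokens = 131_072
tokens_per_word = 1.5
approx_words = int(context_tokens / tokens_per_word)
print(approx_words)  # roughly 87,000 words, i.e. several hundred pages
```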
The current model description does not provide further information on specific intended uses, limitations, or detailed evaluation results.