pltops/qwen2_7B-dis-wspo-full_E1

TEXT GENERATION

Concurrency Cost: 1 · Model Size: 7.6B · Quantization: FP8 · Context Length: 32k · Published: May 2, 2026 · Architecture: Transformer

The pltops/qwen2_7B-dis-wspo-full_E1 model is a 7.6 billion parameter language model, fine-tuned from wh-zhu/qwen2_1.5B-ultrachatfeedback-dpo. It was further trained on the ultrafeedback dataset, suggesting a focus on improving response quality and alignment. This makes it suitable for applications requiring refined conversational abilities and adherence to user preferences.


Model Overview

The pltops/qwen2_7B-dis-wspo-full_E1 is a 7.6 billion parameter language model, representing a fine-tuned iteration of the wh-zhu/qwen2_1.5B-ultrachatfeedback-dpo base model. This model has undergone further training on the ultrafeedback dataset, indicating an emphasis on enhancing its ability to generate high-quality, aligned, and helpful responses based on human feedback.

Key Characteristics

  • Base Model: Fine-tuned from wh-zhu/qwen2_1.5B-ultrachatfeedback-dpo.
  • Parameter Count: 7.6 billion parameters, offering a balance between performance and computational efficiency.
  • Training Data: Utilizes the ultrafeedback dataset, suggesting a focus on improving conversational quality and alignment through preference learning.
  • Context Length: Supports a context length of 32768 tokens, enabling processing of longer inputs and generating more coherent, extended outputs.

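Because the context window is a hard limit of 32768 tokens, callers typically budget the prompt and the planned generation together before sending a request. The helper below is an illustrative sketch (`fits_in_context` is a hypothetical name, not part of any published API) of that bookkeeping:

```python
CONTEXT_LENGTH = 32768  # the model's maximum context window, in tokens

def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    context_length: int = CONTEXT_LENGTH) -> bool:
    """Check that the prompt plus the planned generation budget
    fits inside the model's context window."""
    return prompt_tokens + max_new_tokens <= context_length

# A 30,000-token prompt leaves room for up to 2,768 new tokens.
print(fits_in_context(30_000, 2_768))  # True
print(fits_in_context(30_000, 3_000))  # False
```

In practice the prompt token count would come from the model's own tokenizer, since token boundaries differ between models.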
Training Details

Training used a learning rate of 1e-06, a total batch size of 32, and a single epoch. Optimization was performed with Adam (the specific beta and epsilon values are not reproduced here), under a cosine learning-rate scheduler with a warmup ratio of 0.1. This configuration aims for stable and effective fine-tuning.
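The schedule described above, linear warmup over the first 10% of steps followed by cosine decay, can be sketched as a small function. This is a generic illustration of that schedule shape, not code extracted from the actual training run:

```python
import math

def lr_at_step(step: int, total_steps: int,
               peak_lr: float = 1e-6, warmup_ratio: float = 0.1) -> float:
    """Cosine learning-rate schedule with linear warmup.

    Ramps linearly from 0 to peak_lr over the first warmup_ratio
    fraction of training, then decays back toward 0 on a cosine curve.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

total = 1000
print(lr_at_step(0, total))     # 0.0 (start of warmup)
print(lr_at_step(100, total))   # 1e-06 (peak, end of warmup)
print(lr_at_step(1000, total))  # ~0.0 (end of decay)
```

Libraries such as Hugging Face `transformers` provide an equivalent scheduler out of the box (`get_cosine_schedule_with_warmup`), so this would not normally be hand-rolled.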

Potential Use Cases

This model is likely well-suited for applications requiring:

  • Improved Conversational AI: Generating more natural and contextually relevant dialogue.
  • Content Generation: Creating high-quality text that aligns with user preferences.
  • Instruction Following: Better adherence to complex instructions due to feedback-driven training.

Further details on specific intended uses, limitations, and comprehensive evaluation data are pending.