Name: CEIA-RL/qwen3-4b-dw-lr-dpo-offline-energy-GRPO API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: CEIA-RL

Model Overview

CEIA-RL/qwen3-4b-dw-lr-dpo-offline-energy-GRPO is a 4-billion-parameter language model built on the Qwen3 architecture. It was fine-tuned with Direct Preference Optimization (DPO) on offline energy-related data, as indicated by its name and by the associated Energy-RAG-PENSAR project on Weights & Biases. Training ran for 320 steps, and gpt-oss 120b served as the judge model for evaluation.
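For readers unfamiliar with DPO, the per-pair objective can be sketched as below. This is the standard DPO loss for one (chosen, rejected) response pair, not the authors' training code. The function name, the argument names, and the default `beta=0.1` are assumptions made for illustration; the model card does not state the hyperparameters that were actually used.

```python
import math


def dpo_pair_loss(policy_chosen_logp, policy_rejected_logp,
                  ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for a single preference pair.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    beta=0.1 is an assumed value, not one taken from the model card.
    """
    # How much more the policy favors "chosen" over "rejected",
    # measured relative to the reference model.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    z = beta * margin
    # -log(sigmoid(z)) written as a numerically stable softplus(-z).
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


# When the policy equals the reference, the margin is zero and the loss is log(2).
baseline = dpo_pair_loss(-12.0, -15.0, -12.0, -15.0)
# Raising the chosen response's log-probability lowers the loss.
improved = dpo_pair_loss(-10.0, -15.0, -12.0, -15.0)
```

The loss falls as the policy assigns relatively more probability to the preferred response than the reference model does, which is the sense in which the fine-tuned model is "preference-aligned".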

Key Characteristics

Architecture: Qwen3-based, with 4 billion parameters.
Training Method: Fine-tuned with Direct Preference Optimization (DPO).
Data Focus: Trained on offline energy-related datasets, suggesting specialization in this domain.
Context Length: Supports a context window of 32768 tokens.
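
The 32768-token context length can be turned into a simple budgeting check. The sketch below assumes, as is typical for decoder-only models, that prompt tokens and generated tokens share the same window. The helper name `max_completion_tokens` is hypothetical, and token counts are expected to come from the model's own tokenizer.

```python
CONTEXT_WINDOW = 32768  # tokens, as stated for this model


def max_completion_tokens(prompt_tokens, context_window=CONTEXT_WINDOW):
    """Return how many tokens remain for generation after the prompt.

    Assumes prompt and completion share one window (an assumption,
    not something the model card states explicitly).
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # Clamp at zero: an over-long prompt leaves no room to generate.
    return max(context_window - prompt_tokens, 0)


remaining = max_completion_tokens(30000)
```

A prompt that already exceeds the window leaves a budget of zero, which signals that the input should be truncated or summarized before it is sent.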

Potential Use Cases

Given its specialized training, this model is likely suitable for applications requiring:

Energy-aware AI: Tasks related to energy consumption, optimization, or analysis.
Resource-constrained environments: At 4B parameters, it needs less memory and compute than larger models.
Preference-aligned generation: Producing outputs that follow the preferences learned during DPO.
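
To make the resource-constrained point concrete, the memory needed for the weights alone can be estimated from the parameter count. This is back-of-the-envelope arithmetic under stated assumptions: the nominal figure of 4 billion parameters (the exact count may differ), and storage precisions of 2 bytes (16-bit) or 0.5 bytes (4-bit quantization) per parameter. It ignores activations, the KV cache, and framework overhead.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Approximate memory for model weights only, in GiB (2**30 bytes)."""
    return num_params * bytes_per_param / 2**30


NOMINAL_PARAMS = 4e9  # nominal "4B"; the exact parameter count is an assumption

fp16_gib = weight_memory_gib(NOMINAL_PARAMS, 2)    # 16-bit weights
int4_gib = weight_memory_gib(NOMINAL_PARAMS, 0.5)  # 4-bit quantized weights
```

At 16-bit precision the weights come to roughly 7.5 GiB, and at 4-bit to roughly 1.9 GiB. Actual usage will be higher once the context and runtime overhead are included.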
