Model Overview
ojaffe/2026-04-09-260000-dpo-14b-safety-v1 is a 14-billion-parameter language model fine-tuned from the Qwen/Qwen3-14B base model. It supports a 32,768-token context length, making it suitable for processing long inputs and generating comprehensive responses.
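A minimal inference sketch using the Hugging Face `transformers` chat API. The system prompt, generation settings, and example question below are illustrative placeholders, not part of the model release:

```python
def build_messages(user_prompt: str) -> list[dict]:
    """Build a standard chat-format message list (system prompt is illustrative)."""
    return [
        {"role": "system", "content": "You are a helpful, safety-conscious assistant."},
        {"role": "user", "content": user_prompt},
    ]

def main() -> None:
    # Third-party imports kept local so the helper above stays dependency-free.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "ojaffe/2026-04-09-260000-dpo-14b-safety-v1"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    input_ids = tokenizer.apply_chat_template(
        build_messages("Explain why phishing emails are dangerous."),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=256)
    # Decode only the newly generated tokens, not the prompt.
    print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Note that calling `main()` downloads the full 14B-parameter weights, so it is best run on a machine with a suitable GPU.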
Key Capabilities
- Safety and Alignment: The model was trained with Direct Preference Optimization (DPO), a method for aligning language models with human preferences and improving safety. This training helps the model produce more appropriate, less harmful outputs.
- Fine-tuned Performance: Building on the Qwen3-14B architecture, DPO fine-tuning improves instruction following and steers the model toward intended behavior, particularly in safety-critical contexts.
- Training Framework: The model was trained with the TRL (Transformer Reinforcement Learning) library, reflecting a focus on modern fine-tuning techniques for performance and alignment.
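As a rough illustration of that setup, DPO training with TRL pairs each prompt with a preferred ("chosen") and a dispreferred ("rejected") completion. The sketch below assumes TRL's `DPOTrainer`/`DPOConfig` API; the dataset contents and hyperparameters are placeholders, and this is not the authors' actual training script:

```python
def to_preference_example(prompt: str, chosen: str, rejected: str) -> dict:
    # DPOTrainer expects records with exactly these three keys.
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}

def train() -> None:
    # Heavy third-party imports kept local; this function is a sketch only.
    from datasets import Dataset
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from trl import DPOConfig, DPOTrainer

    base = "Qwen/Qwen3-14B"
    tokenizer = AutoTokenizer.from_pretrained(base)
    model = AutoModelForCausalLM.from_pretrained(base)

    # Tiny illustrative preference dataset (placeholder content).
    dataset = Dataset.from_list([
        to_preference_example(
            "How do I pick a lock?",
            "I can't help with bypassing locks you don't own; contact a locksmith.",
            "Sure, here is how to pick a lock: ...",
        ),
    ])

    trainer = DPOTrainer(
        model=model,  # policy model; TRL builds a frozen reference copy by default
        args=DPOConfig(output_dir="dpo-out", beta=0.1),  # beta value is a placeholder
        train_dataset=dataset,
        processing_class=tokenizer,
    )
    trainer.train()
```

In practice the preference dataset would contain many thousands of such pairs, and `beta` controls how far the policy is allowed to drift from the reference model.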
Use Cases
This model is particularly well-suited for applications where safety, alignment, and adherence to specific behavioral guidelines are paramount. It can be used in scenarios requiring:
- Content moderation and filtering.
- Generating safe and ethical responses in conversational AI.
- Applications where reducing harmful or biased outputs is a priority.
For more technical details on the DPO training method, refer to the Direct Preference Optimization paper.
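For reference, the objective from that paper optimizes the policy $\pi_\theta$ against a frozen reference model $\pi_{\mathrm{ref}}$ over preference triples $(x, y_w, y_l)$ of a prompt, a preferred completion, and a dispreferred completion:

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta; \pi_{\mathrm{ref}}) =
  -\,\mathbb{E}_{(x, y_w, y_l) \sim \mathcal{D}}
  \left[ \log \sigma\!\left(
    \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
    - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
  \right) \right]
```

Here $\sigma$ is the logistic function and $\beta$ is a temperature that controls how strongly the policy is penalized for drifting from the reference model.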