NovoCode/Phi-2-DPO

Text Generation · Model size: 3B · Quantization: BF16 · Context length: 2k · Published: Jan 28, 2024 · License: MIT · Architecture: Transformer

NovoCode/Phi-2-DPO is a 3-billion-parameter causal language model fine-tuned from Microsoft's Phi-2. It was trained on the Intel/orca_dpo_pairs dataset with a focus on instruction following and preference alignment. The model is optimized for generating responses to explicit instructions, making it suitable for conversational AI and task-oriented applications, and its 2048-token context length balances capability with efficient resource usage.


NovoCode/Phi-2-DPO: Instruction-Tuned Language Model

NovoCode/Phi-2-DPO is a 3-billion-parameter language model derived from Microsoft's Phi-2 architecture. It was fine-tuned on the Intel/orca_dpo_pairs dataset, a preference-pair dataset designed for Direct Preference Optimization (DPO) training, to improve instruction following and response quality.
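A minimal loading-and-generation sketch with the Hugging Face transformers library is shown below. The repo id comes from this card; the Instruct/Output prompt format follows the base Phi-2 convention and is an assumption here, so check the repository for the exact template.

```python
# Minimal inference sketch. Assumes the Phi-2 "Instruct:/Output:" prompt
# style; verify the exact template in the model repository.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NovoCode/Phi-2-DPO"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card lists BF16 weights
    device_map="auto",
)

prompt = "Instruct: Explain DPO fine-tuning in two sentences.\nOutput:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Decode only the newly generated tokens, dropping the prompt.
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```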

Key Capabilities

  • Instruction Following: Optimized to generate responses that align with explicit user instructions, leveraging the DPO training methodology.
  • Compact Size: At 3 billion parameters, it offers a balance between performance and computational efficiency, suitable for various deployment scenarios.
  • Context Length: Supports a sequence length of 2048 tokens, allowing it to process moderately sized inputs and maintain conversational context (see the truncation sketch after this list).
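Because the window is fixed at 2048 tokens, long conversations need to be trimmed before generation. The sketch below left-truncates so the most recent turns survive; it reuses the `tokenizer` from the loading example above, and `long_conversation_text` is a hypothetical variable holding the running transcript.

```python
# Keep inputs inside the 2048-token window, reserving room for the reply.
MAX_CTX = 2048
MAX_NEW_TOKENS = 256

# Left-truncation drops the oldest turns first when the transcript
# grows past the window.
tokenizer.truncation_side = "left"
inputs = tokenizer(
    long_conversation_text,  # hypothetical running-transcript variable
    return_tensors="pt",
    truncation=True,
    max_length=MAX_CTX - MAX_NEW_TOKENS,
)
```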

Training Details

The model was trained with a learning rate of 3e-06, a micro batch size of 2, and 2 epochs, using an Adam optimizer with a cosine learning-rate schedule and 100 warmup steps. The final validation loss was approximately 1.2999.
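For orientation, here is a hedged reconstruction of that recipe using the TRL library's DPOTrainer, not the authors' exact script. Argument names follow recent TRL/transformers releases (older TRL versions pass `tokenizer=` instead of `processing_class=`), and the column mapping assumes the system/question/chosen/rejected schema of Intel/orca_dpo_pairs.

```python
# A sketch of the stated hyperparameters with TRL's DPOTrainer.
# Verify argument names against the TRL version you have installed.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "microsoft/phi-2"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # Phi-2 ships without a pad token

# Map system/question/chosen/rejected columns to the
# prompt/chosen/rejected format DPOTrainer expects.
dataset = load_dataset("Intel/orca_dpo_pairs", split="train")

def to_dpo_format(row):
    return {
        "prompt": f"{row['system']}\n{row['question']}".strip(),
        "chosen": row["chosen"],
        "rejected": row["rejected"],
    }

dataset = dataset.map(to_dpo_format, remove_columns=dataset.column_names)

config = DPOConfig(
    output_dir="phi-2-dpo",
    learning_rate=3e-6,             # from the card
    per_device_train_batch_size=2,  # "micro batch size of 2"
    num_train_epochs=2,             # 2 epochs
    lr_scheduler_type="cosine",     # cosine schedule
    warmup_steps=100,               # 100 warmup steps
    optim="adamw_torch",            # Adam-family optimizer
)

trainer = DPOTrainer(
    model=model,                    # a frozen reference copy is created internally
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,     # older TRL versions: tokenizer=tokenizer
)
trainer.train()
```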

Good For

  • Conversational AI: Developing chatbots or virtual assistants that require precise instruction adherence.
  • Task-Oriented Applications: Scenarios where the model needs to perform specific tasks based on user prompts.
  • Research and Development: Experimenting with DPO-tuned models on a smaller, efficient architecture.