Model Overview
nbeerbower/llama-3-bophades-v3-8B is an 8 billion parameter model built on the Llama-3-8B architecture. It has been fine-tuned with Direct Preference Optimization (DPO) to improve truthfulness and mathematical reasoning.
Key Capabilities
- Enhanced Truthfulness: Fine-tuned on the jondurbin/truthy-dpo-v0.1 dataset to improve the factual accuracy of its responses.
- Mathematical Reasoning: Leverages the kyujinpy/orca_math_dpo dataset to strengthen its ability to solve mathematical problems.
- DPO Fine-tuning: Uses Direct Preference Optimization for alignment, aiming to produce more helpful and harmless outputs.
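DPO trains the model to prefer the chosen response over the rejected one in each preference pair. The per-pair loss can be sketched in plain Python as below; the beta value is illustrative and not taken from this model's training configuration.

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-pair DPO loss: -log sigmoid(beta * (chosen margin - rejected margin)).

    Each argument is the summed token log-probability of the chosen/rejected
    completion under the policy or the frozen reference model.
    """
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # Numerically stable form of -log(sigmoid(logits)) == log(1 + exp(-logits))
    return math.log1p(math.exp(-logits))
```

When the policy and reference agree exactly, the loss is log(2); it shrinks as the policy assigns relatively more probability to the chosen response.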
Training Details
The model was fine-tuned on an A100 GPU using Google Colab. The DPO run used LoRA adapters (r=16, lora_alpha=16, lora_dropout=0.05) with a learning rate of 5e-5 for 1000 steps. For dataset preparation, the truthy-dpo-v0.1 and orca_math_dpo datasets were concatenated and formatted into a ChatML-like layout for DPO training, with a max_prompt_length of 2048 and a max_length of 4096.
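The card does not show the exact formatting template. The sketch below illustrates one plausible ChatML-style layout for a preference pair; the field names (`prompt`, `chosen`, `rejected`) are assumptions, and the prompt cap is applied in characters here for simplicity, whereas a real pipeline would truncate by tokens.

```python
def format_chatml_pair(example: dict, max_prompt_length: int = 2048) -> dict:
    """Format one preference example into a ChatML-like layout for DPO.

    `example` is assumed to hold 'prompt', 'chosen', and 'rejected' strings.
    """
    prompt = example["prompt"][:max_prompt_length]  # character-level cap, for illustration
    wrapped_prompt = (
        "<|im_start|>user\n" + prompt + "<|im_end|>\n<|im_start|>assistant\n"
    )
    return {
        "prompt": wrapped_prompt,
        "chosen": example["chosen"] + "<|im_end|>",
        "rejected": example["rejected"] + "<|im_end|>",
    }
```

The trainer then sees both completions continuing from the same wrapped prompt, which is what the DPO objective compares.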
Ideal Use Cases
This model is particularly well-suited for applications where high factual accuracy and strong mathematical problem-solving are critical. It can be beneficial for tasks such as generating accurate summaries, answering factual questions, and assisting with mathematical computations.