nbeerbower/llama-3-bophades-v2-8B

Hosted on Hugging Face

Text generation · Model size: 8B · Quant: FP8 · Context length: 8K · License: llama3 · Architecture: Transformer

nbeerbower/llama-3-bophades-v2-8B is an 8-billion-parameter language model fine-tuned from llama-3-sauce-v1-8B (itself derived from Llama-3-8b) using Direct Preference Optimization (DPO). The fine-tuning targets truthfulness and mathematical reasoning, making the model well suited to tasks that require accurate factual recall and robust numerical problem-solving.


Overview

nbeerbower/llama-3-bophades-v2-8B is an 8-billion-parameter large language model derived from the Llama-3-8b architecture. It was fine-tuned with Direct Preference Optimization (DPO) in a Google Colab environment on an A100 GPU. The base model, llama-3-sauce-v1-8B, was further trained on two preference datasets: jondurbin/truthy-dpo-v0.1 (truthfulness) and kyujinpy/orca_math_dpo (mathematical problem-solving). This targeted fine-tuning aims to improve the model's factual accuracy and mathematical reasoning.
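To make the DPO objective concrete, here is a minimal sketch of the per-pair DPO loss in plain Python. The function name, argument names, and the example log-probability values are illustrative, not taken from the model card; in practice this loss is computed by a training library (e.g. TRL's `DPOTrainer`) over batches of tokenized completions.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for a single preference pair.

    Each argument is the summed token log-likelihood of the chosen
    (preferred) or rejected completion under the policy being trained
    or the frozen reference model. beta scales the implicit reward.
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)), written as softplus(-margin) for stability
    return math.log1p(math.exp(-margin))

# Loss drops below log(2) once the policy prefers the chosen answer
# more strongly than the reference model does (illustrative values):
print(dpo_loss(-10.0, -30.0, -12.0, -28.0))
```

Minimizing this pushes the policy to assign relatively more probability to preferred (e.g. truthful, mathematically correct) completions than the reference model does, which is how the truthy-dpo and orca_math_dpo preference pairs shape the model.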

Key Capabilities

  • Enhanced Truthfulness: Fine-tuned on a dataset designed to improve factual correctness and reduce hallucinations.
  • Improved Mathematical Reasoning: Benefits from training on a dataset focused on mathematical problem-solving.
  • Llama-3 Base: Leverages the strong foundational capabilities of the Llama-3-8b model.
  • DPO Fine-tuning: Utilizes Direct Preference Optimization for aligning model outputs with desired human preferences.
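The model card does not state a prompt format. Since the model descends from Llama-3-8b, a reasonable assumption is the standard Llama-3 Instruct chat template, sketched below as a plain string builder; with `transformers`, prefer `tokenizer.apply_chat_template`, which applies the template shipped with the model.

```python
def build_llama3_prompt(system, user):
    """Assemble a Llama-3 Instruct-style chat prompt.

    Assumed format (not confirmed by the model card): Llama-3 uses
    <|start_header_id|>/<|end_header_id|> role headers and <|eot_id|>
    turn terminators, with generation continuing after the final
    assistant header.
    """
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt(
    "You are a careful assistant that answers truthfully.",
    "What is 17 * 24?",
)
print(prompt)
```

If the model responds poorly to this format, check the `tokenizer_config.json` on the Hugging Face repository for the authoritative chat template.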

Good For

  • Applications requiring high factual accuracy.
  • Tasks involving mathematical calculations and logical reasoning.
  • Use cases where a robust 8B parameter model with improved truthfulness is beneficial.