Lelantos-DPO-7B: A DPO-Fine-Tuned 7B Language Model
Lelantos-DPO-7B is a 7-billion-parameter language model developed by SanjiWatsuki, distinguished by its fine-tuning with Direct Preference Optimization (DPO). DPO aligns a model with human preferences directly from paired preference data, without training a separate reward model, leading to improved performance on a range of evaluation benchmarks.
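For intuition, DPO optimizes a simple contrastive objective over (prompt, chosen, rejected) triples, pushing the policy's log-probabilities toward preferred responses relative to a frozen reference model. The sketch below is a minimal illustration of the published DPO loss (Rafailov et al., 2023), not this model's actual training code; the `beta` value and tensor shapes are assumptions.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """DPO loss over a batch of (chosen, rejected) response pairs.

    Each argument is a 1-D tensor of summed token log-probabilities for a
    response under either the trainable policy or the frozen reference model.
    beta=0.1 is a common choice, assumed here rather than taken from this model.
    """
    # How much more the policy favors each response than the reference does
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps
    # Maximize the margin between chosen and rejected log-ratios
    return -F.logsigmoid(beta * (chosen_logratio - rejected_logratio)).mean()

# Toy usage with synthetic log-probabilities for a batch of 4 pairs
base = torch.randn(4)
print(dpo_loss(base + 0.5, base - 0.5, base, base).item())
```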
Key Capabilities & Performance
The model demonstrates solid performance across four established benchmark suites, with an overall average of 58.54% (the mean of the four suite averages below).
- AGIEval: Achieves an average of 45.47%, with notable scores on tasks like `agieval_sat_en` (76.70%) and `agieval_lsat_rc` (65.06%).
- GPT4All: Scores an average of 75.0%, performing well on `arc_easy` (85.40%), `boolq` (87.25%), and `winogrande` (77.27%).
- TruthfulQA: Records 67.05% on the `truthfulqa_mc` benchmark, indicating a good capacity for generating truthful, informative responses.
- Bigbench: Attains an average of 46.64%, showing competence on tasks such as `bigbench_sports_understanding` (73.23%) and `bigbench_snarks` (72.38%).
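These task names follow EleutherAI's lm-evaluation-harness conventions. Below is a minimal sketch of re-running a subset with the harness's v0.4+ Python API; the Hub repo id is assumed from the author's naming, and task names have changed across harness versions, so verify both against your install.

```python
# pip install lm-eval  (EleutherAI lm-evaluation-harness)
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    # Repo id assumed; confirm on the Hugging Face Hub
    model_args="pretrained=SanjiWatsuki/Lelantos-DPO-7B",
    tasks=["arc_easy", "boolq", "winogrande"],  # names vary by harness version
    batch_size=8,
)
print(results["results"])  # per-task accuracy and related metrics
```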
What Makes This Different?
Lelantos-DPO-7B stands out for its DPO fine-tuning, which targets response quality and preference alignment. Compared with its base model, Lelantos-7B, the DPO version posts a modest gain in overall average (58.54% vs. 58.04%), with the largest improvement on TruthfulQA (67.05% vs. 64.93%), consistent with enhanced truthfulness and preference alignment.
Should I Use This for My Use Case?
This model is a strong candidate for applications requiring reliable general-purpose language understanding and generation. Its balanced performance across diverse benchmarks makes it suitable for tasks such as:
- Question Answering: Especially where factual accuracy and reasoning are important.
- Content Generation: For producing coherent and contextually relevant text.
- Conversational AI: Where aligned and truthful responses are desired.
Consider Lelantos-DPO-7B if your application benefits from a 7B model with demonstrated capabilities in reasoning, knowledge recall, and preference alignment.
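For a quick start, a minimal inference sketch with Hugging Face transformers follows. The Hub repo id and the availability of a built-in chat template are assumptions; check the model page for the exact prompt format.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id assumed from the author's naming convention; verify on the Hub
model_id = "SanjiWatsuki/Lelantos-DPO-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", torch_dtype="auto"
)

messages = [{"role": "user", "content": "Summarize DPO in one sentence."}]
# Assumes the tokenizer ships a chat template; otherwise format the prompt manually
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```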