SanjiWatsuki/Kunoichi-DPO-7B

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 8k · Published: Jan 11, 2024 · License: cc-by-nc-4.0 · Architecture: Transformer

Kunoichi-DPO-7B by SanjiWatsuki is a 7-billion-parameter language model created by DPO fine-tuning the Kunoichi-7B base model. It is optimized for general use, with stronger reasoning and instruction following than its base model, and supports an 8192-token context window, making it suitable for applications that demand close instruction adherence.


Kunoichi-DPO-7B: Enhanced Reasoning and Instruction Following

Kunoichi-DPO-7B is a 7-billion-parameter language model developed by SanjiWatsuki, created by applying Direct Preference Optimization (DPO) fine-tuning to the Kunoichi-7B base model, using Intel's Orca DPO pairs formatted with the Alpaca template. This fine-tuning targets general use cases, improving the model's ability to understand and follow instructions as well as its reasoning.
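The Alpaca template mentioned above follows a widely used layout. Below is a minimal sketch of a prompt builder; the exact header text and spacing are the standard Alpaca format, assumed rather than confirmed for this specific model:

```python
def build_alpaca_prompt(instruction: str, input_text: str = "") -> str:
    """Format a request in the standard Alpaca instruction template."""
    if input_text:
        # Variant with an additional context block.
        header = ("Below is an instruction that describes a task, paired with "
                  "an input that provides further context. Write a response "
                  "that appropriately completes the request.")
        return (f"{header}\n\n### Instruction:\n{instruction}\n\n"
                f"### Input:\n{input_text}\n\n### Response:\n")
    header = ("Below is an instruction that describes a task. "
              "Write a response that appropriately completes the request.")
    return f"{header}\n\n### Instruction:\n{instruction}\n\n### Response:\n"

prompt = build_alpaca_prompt("Summarize the following paragraph in one sentence.")
```

The model's completion is then read from the text generated after the `### Response:` marker.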

Key Capabilities & Performance

  • Improved Instruction Following: The DPO fine-tuning with the Orca dataset has significantly strengthened its instruction adherence.
  • Enhanced Reasoning: Benchmarks indicate a notable improvement in reasoning tasks compared to the base Kunoichi-7B.
  • Competitive Benchmarking: Kunoichi-DPO-7B achieves an MT Bench score of 8.29 and a Logic Test score of 0.59, outperforming its base model and several other 7B models like Starling-7B and Silicon-Maid-7B in certain metrics. It also shows a higher average score (58.4) across various benchmarks compared to Kunoichi-7B (57.54).
  • Context Window: The model is designed for an 8k context window, with experimental support for up to 16k using an NTK RoPE alpha of 2.6.
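The 16k extension can be understood through the common NTK-aware RoPE scaling heuristic, which raises the rotary embedding's frequency base by a factor derived from alpha. A rough sketch, assuming the standard Mistral-family defaults of base 10000 and head dimension 128 (these values are assumptions, not stated in this card):

```python
def ntk_scaled_rope_base(alpha: float, base: float = 10000.0,
                         head_dim: int = 128) -> float:
    """NTK-aware RoPE scaling: raise the frequency base by alpha^(d/(d-2))."""
    return base * alpha ** (head_dim / (head_dim - 2))

# With the suggested alpha of 2.6 for ~16k context, the effective
# RoPE base grows to roughly 26,400 from the default 10,000.
new_base = ntk_scaled_rope_base(2.6)
```

Inference backends that expose an NTK alpha (or a dynamic `rope_scaling` factor) apply an equivalent adjustment internally.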

When to Use This Model

  • General Purpose Applications: Ideal for tasks requiring robust instruction following and general reasoning.
  • Applications Needing Stronger Adherence: If your use case demands precise responses based on given instructions, this model offers an advantage over its base version.
  • Evaluating 7B-Class Options: Its competitive benchmark results against other 7B models make it a strong candidate when comparing models in this size class.

Popular Sampler Settings

The most popular configurations among Featherless users for this model tune the following sampler parameters: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
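To illustrate how one of these sampler settings acts, here is a minimal sketch of min_p filtering combined with temperature, following the commonly described rule of scaling the keep-threshold by the top token's probability. This is an illustration of the technique, not Featherless's implementation:

```python
import math

def min_p_filter(logits, temperature=1.0, min_p=0.05):
    """Apply temperature, then keep only tokens whose probability is at
    least min_p times the most likely token's probability; renormalize."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    threshold = min_p * max(probs)
    kept = {i: p for i, p in enumerate(probs) if p >= threshold}
    norm = sum(kept.values())
    return {i: p / norm for i, p in kept.items()}

# Tokens with logits far below the leader are pruned before sampling.
dist = min_p_filter([5.0, 4.0, 1.0, 0.0], temperature=1.0, min_p=0.05)
```

Raising min_p makes sampling more conservative by cutting the tail harder; temperature, top_p, top_k, and the penalty parameters shape the distribution in complementary ways.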