SanjiWatsuki/Kunoichi-DPO-v2-7B

Text generation · 7B parameters · FP8 quantization · 8k context length · Concurrency cost: 1 · Published: Jan 13, 2024 · License: cc-by-nc-4.0 · Architecture: Transformer

SanjiWatsuki/Kunoichi-DPO-v2-7B is a 7 billion parameter causal language model developed by SanjiWatsuki, fine-tuned with Direct Preference Optimization (DPO). This model demonstrates strong performance across various benchmarks, including an MT Bench score of 8.51 and an AlpacaEval2 score of 17.19%, positioning it competitively among 7B models. It is optimized for general instruction following and conversational tasks, offering a balanced capability set for diverse applications.


Kunoichi-DPO-v2-7B: An Enhanced 7B Instruction-Following Model

Kunoichi-DPO-v2-7B is a 7 billion parameter language model developed by SanjiWatsuki that improves instruction following through Direct Preference Optimization (DPO). This iteration builds on previous Kunoichi versions with measurable gains across a range of benchmarks.
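For readers unfamiliar with DPO, the standard objective it optimizes can be sketched in a few lines. This is the generic DPO loss (from the original DPO formulation), not code from this model's actual training run; the function name and example numbers are illustrative.

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for a single preference pair.

    logp_* are total log-probabilities of the chosen/rejected responses
    under the policy being trained; ref_logp_* are the same quantities
    under the frozen reference model. beta scales the implicit reward.
    """
    chosen_reward = beta * (logp_chosen - ref_logp_chosen)
    rejected_reward = beta * (logp_rejected - ref_logp_rejected)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)): small when the policy prefers the chosen response
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# The loss shrinks as the policy assigns relatively more probability
# to the chosen response than the reference model does.
print(dpo_loss(-10.0, -12.0, -11.0, -11.0))  # policy prefers chosen response
print(dpo_loss(-12.0, -10.0, -11.0, -11.0))  # policy prefers rejected response
```

Training on preference pairs with this loss nudges the model toward responses humans (or a judge model) preferred, which is what the MT Bench and AlpacaEval2 gains below reflect.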

Key Capabilities and Performance

  • Strong Instruction Following: Achieves an MT Bench score of 8.51, placing it above models like Mixtral-8x7B-Instruct and Starling-7B in this metric.
  • Competitive General Performance: Demonstrates a solid average score of 58.31 across benchmarks including AGIEval, GPT4All, TruthfulQA, and Bigbench, indicating robust general knowledge and reasoning.
  • High-Quality Response Generation: Scores 17.19% on AlpacaEval2, matching Claude 2 and outperforming many other 7B and even some larger models in generating preferred responses.
  • Balanced Benchmark Results: With an MMLU score of 64.94 and a Logic Test score of 0.58, it offers a well-rounded performance profile.

Ideal Use Cases

  • General-purpose chatbots: Its strong instruction-following and conversational scores make it suitable for interactive AI applications.
  • Content generation: Capable of producing coherent and contextually relevant text for various tasks.
  • Research and development: Provides a competitive base model for further fine-tuning or experimentation in the 7B parameter class.

Popular Sampler Settings

The most popular parameter combinations used by Featherless users for this model adjust the following sampler settings:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
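For illustration, here is a minimal sketch of how several of these settings (temperature, top_k, top_p, min_p) typically interact when filtering a next-token distribution during decoding. The function name and filter ordering are illustrative, not Featherless's actual pipeline, and the penalty parameters (which down-weight logits of already-generated tokens) are omitted for brevity.

```python
import math

def filter_logits(logits, temperature=1.0, top_k=0, top_p=1.0, min_p=0.0):
    """Apply common sampler settings to raw next-token logits.

    Returns a renormalized probability distribution with filtered
    tokens zeroed out. top_k=0 and top_p=1.0 disable those filters.
    """
    # Temperature: scale logits before softmax (lower = sharper distribution)
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order)

    # top_k: keep only the k most probable tokens
    if top_k > 0:
        keep &= set(order[:top_k])

    # top_p (nucleus): keep the smallest prefix whose cumulative mass reaches top_p
    if top_p < 1.0:
        cum, nucleus = 0.0, set()
        for i in order:
            nucleus.add(i)
            cum += probs[i]
            if cum >= top_p:
                break
        keep &= nucleus

    # min_p: drop tokens below min_p times the top token's probability
    if min_p > 0.0:
        cap = min_p * probs[order[0]]
        keep &= {i for i in order if probs[i] >= cap}

    filtered = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    z = sum(filtered)
    return [p / z for p in filtered]

# Example: restrict sampling to the two most likely tokens
print(filter_logits([2.0, 1.0, 0.0, -1.0], top_k=2))
```

Lower temperature sharpens the distribution, while top_k, top_p, and min_p each prune the tail of unlikely tokens by a different criterion; real inference stacks combine them in roughly this fashion.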