antiven0m/finch
Finch is a 7 billion parameter language model created by antiven0m, resulting from a SLERP merge of macadeliccc/WestLake-7B-v2-laser-truthy-dpo and SanjiWatsuki/Kunoichi-DPO-v2-7B. The merge combines the strengths of its base models, offering balanced performance across a range of benchmarks. With a 4096-token context length, Finch is suitable for general-purpose text generation and understanding tasks.
Overview
Finch was created through a SLERP (spherical linear interpolation) merge of two 7B models: macadeliccc/WestLake-7B-v2-laser-truthy-dpo and SanjiWatsuki/Kunoichi-DPO-v2-7B. SLERP interpolates between the weights of the two base models, aiming to combine their strengths in a single versatile model.
Key Capabilities & Performance
Finch demonstrates solid performance across a range of benchmarks, achieving an average score of 73.78 on the Open LLM Leaderboard. Notable scores include:
- AI2 Reasoning Challenge (25-Shot): 71.59
- HellaSwag (10-Shot): 87.87
- MMLU (5-Shot): 64.81
- TruthfulQA (0-shot): 67.96
- Winogrande (5-shot): 84.14
- GSM8k (5-shot): 66.34
Detailed evaluation results are available on the Open LLM Leaderboard.
Recommended Usage
Finch uses the ChatML prompt format. For best results, the following sampler settings are recommended:
- Temperature: 1.2
- Min P: 0.2
- Smoothing Factor: 0.2
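The ChatML format and the settings above can be sketched as follows. This is a minimal illustration, not code from the model's repository: the `format_chatml` helper is hypothetical, `temperature` and `min_p` correspond to Hugging Face `transformers` generation arguments (`min_p` requires a recent `transformers` release), while the smoothing factor belongs to backends such as text-generation-webui and has no standard `transformers` equivalent.

```python
# Sketch: building a ChatML prompt and collecting the recommended
# sampler settings. format_chatml is an illustrative helper only.
def format_chatml(messages):
    """Render a list of {role, content} dicts in the ChatML format."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    # Open an assistant turn so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = format_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize SLERP merging in one sentence."},
])

# Recommended sampler settings from this model card; temperature and
# min_p can be passed to transformers' generate(), the smoothing factor
# must be configured in a backend that supports quadratic sampling.
sampler_settings = {"temperature": 1.2, "min_p": 0.2, "do_sample": True}
```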
Quantized versions, including EXL2 and GGUF formats, are also available for efficient deployment.