Model Overview
vince62s/phi-2-psy is a language model with roughly 3 billion parameters, published by vince62s. It was created by merging two Phi-2 based models: rhysjones/phi-2-orange and cognitivecomputations/dolphin-2_6-phi-2. The merge uses spherical linear interpolation (slerp) and aims to combine the strengths of both source models.
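To make the slerp idea concrete, the following is a minimal sketch of spherical linear interpolation between two weight tensors, written with NumPy. It is an illustration of the general technique, not the merge tooling actually used for this model; the function name `slerp`, the `eps` tolerance, and the fallback to linear interpolation for near-parallel tensors are assumptions made for this example.

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between v0 (t=0) and v1 (t=1).

    Illustrative sketch only: the tensors are treated as flat vectors
    and the angle between them is measured on normalised copies.
    """
    v0 = np.asarray(v0, dtype=np.float64)
    v1 = np.asarray(v1, dtype=np.float64)

    # Normalised copies are used only to measure the angle.
    n0 = v0 / (np.linalg.norm(v0) + eps)
    n1 = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.sum(n0 * n1), -1.0, 1.0)

    # Near-parallel vectors: sin(theta) is close to zero, so fall back
    # to ordinary linear interpolation (an assumption of this sketch).
    if abs(dot) > 1.0 - 1e-6:
        return (1.0 - t) * v0 + t * v1

    theta = np.arccos(dot)
    sin_theta = np.sin(theta)
    w0 = np.sin((1.0 - t) * theta) / sin_theta
    w1 = np.sin(t * theta) / sin_theta
    return w0 * v0 + w1 * v1
```

Calling `slerp(0.5, a, b)` on two tensors of the same shape returns a blend that follows the arc between them rather than the straight line, which is the property that distinguishes slerp from plain weight averaging.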
Key Capabilities & Performance
The model was evaluated with LLM AutoEval on the Nous suite and on the Open LLM Leaderboard. On the Nous suite it achieves an average score of 48.02, outperforming both of its source models and the original Microsoft Phi-2. Notable scores include:
- AGIEval: 34.4
- GPT4All: 71.4
- TruthfulQA: 48.2
On the Open LLM Leaderboard, it records an average of 62.80, with strong results on HellaSwag (75.52) and Winogrande (75.45). The merge configuration specifies the layer ranges taken from the source models and the weighting applied to their parameters during the merge.
Usage Considerations
With roughly 3 billion parameters and a 2048-token context length, phi-2-psy suits applications that need efficient language processing with competitive performance for its size. Its benchmark scores suggest it is a reasonable candidate for general text generation, question answering, and reasoning tasks where a small but capable model is desired.
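As a rough illustration of working within the 2048-token context, the sketch below loads the model with the Hugging Face `transformers` library and clamps the generation length so that prompt plus output fits the window. The helper names `max_new_tokens_for` and `generate`, the default of 128 new tokens, and the use of a plain-text prompt with no special template are assumptions for this example; older `transformers` versions may also require `trust_remote_code=True` when loading Phi-2 based checkpoints.

```python
MODEL_ID = "vince62s/phi-2-psy"
CONTEXT_LENGTH = 2048  # context length stated in this overview

def max_new_tokens_for(prompt_tokens, requested, context_length=CONTEXT_LENGTH):
    """Clamp the requested output length so prompt + output fits the context."""
    if prompt_tokens >= context_length:
        raise ValueError("prompt alone fills or exceeds the context window")
    return min(requested, context_length - prompt_tokens)

def generate(prompt, requested_new_tokens=128):
    """Hypothetical helper: generate a completion for a plain-text prompt."""
    # Imported lazily so the budgeting helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    prompt_tokens = inputs["input_ids"].shape[1]
    budget = max_new_tokens_for(prompt_tokens, requested_new_tokens)

    output_ids = model.generate(**inputs, max_new_tokens=budget)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `generate("Explain what a model merge is in one sentence.")` downloads the weights on first use, so it needs network access and enough memory to hold the model.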