jsfs11/WONMSeverusDevil-TIES-7B
jsfs11/WONMSeverusDevil-TIES-7B is a 7-billion-parameter language model merged from FelixChao/WestSeverus-7B-DPO-v2, jsfs11/WestOrcaNeuralMarco-DPO-v2-DARETIES-7B, and mlabonne/Daredevil-7B using the TIES merge method. Built on the Mistral-7B-v0.1 base model, it has a 4096-token context length and achieves an average Open-LLM benchmark score of 60.91. It is intended for general language tasks, leveraging the combined strengths of its constituent models.
WONMSeverusDevil-TIES-7B: A Merged 7B Language Model
This model, jsfs11/WONMSeverusDevil-TIES-7B, is a 7-billion-parameter language model created by jsfs11 via TIES merging. It combines three models, FelixChao/WestSeverus-7B-DPO-v2, jsfs11/WestOrcaNeuralMarco-DPO-v2-DARETIES-7B, and mlabonne/Daredevil-7B, all based on the mistralai/Mistral-7B-v0.1 architecture.
Key Capabilities & Performance
The model performs competitively across several benchmarks, as reflected in its Open-LLM AutoEval scores:
- AGIEval: 45.26
- GPT4All: 77.07
- TruthfulQA: 72.47
- Bigbench: 48.85
- Average Score: 60.91
This merged architecture aims to combine the strengths of its components, yielding a robust model for general-purpose language generation and understanding. The merge configuration applied density and weight gradients to each constituent model, with the int8_mask and normalize options enabled during the TIES merge; a sketch of such a configuration appears below.
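A TIES merge of this kind is typically described as a mergekit YAML file. The sketch below follows mergekit's documented configuration format, but the specific density gradients and weight values are illustrative assumptions, not the settings actually used for this model:

```yaml
# Hypothetical mergekit TIES configuration; the density/weight values are
# illustrative placeholders, not the real WONMSeverusDevil-TIES-7B settings.
models:
  - model: mistralai/Mistral-7B-v0.1
    # base model: contributes no task vector of its own
  - model: FelixChao/WestSeverus-7B-DPO-v2
    parameters:
      density: [1.0, 0.7, 0.5]  # example gradient, interpolated across layers
      weight: 0.35
  - model: jsfs11/WestOrcaNeuralMarco-DPO-v2-DARETIES-7B
    parameters:
      density: [1.0, 0.7, 0.5]
      weight: 0.35
  - model: mlabonne/Daredevil-7B
    parameters:
      density: [1.0, 0.7, 0.5]
      weight: 0.30
merge_method: ties
base_model: mistralai/Mistral-7B-v0.1
parameters:
  normalize: true   # rescale the summed task vectors
  int8_mask: true   # compute sign masks in int8 to save memory
dtype: float16
```

Such a file is passed to the mergekit-yaml command (e.g. `mergekit-yaml config.yml ./merged-model`) to produce the merged checkpoint.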
Usage
Developers can integrate the model with the Hugging Face transformers library. It supports standard text-generation pipelines, and loading the weights in torch.float16 keeps inference memory-efficient, as in the sketch below.
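A minimal sketch of such a pipeline, assuming the accelerate package is installed so that device_map="auto" works; the prompt and sampling parameters are illustrative:

```python
# Minimal text-generation sketch; prompt and sampling values are illustrative.
import torch
import transformers
from transformers import AutoTokenizer

model_id = "jsfs11/WONMSeverusDevil-TIES-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.float16,  # halves memory use relative to float32
    device_map="auto",          # requires the accelerate package
)

messages = [{"role": "user", "content": "What is a TIES merge?"}]
# Use the tokenizer's chat template if one is defined; otherwise fall back
# to passing the raw prompt string directly.
try:
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
except ValueError:
    prompt = messages[0]["content"]

outputs = pipeline(
    prompt,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95,
)
print(outputs[0]["generated_text"])
```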