DS-Archive/Chronohermes-Grad-L2-13b
Chronohermes-Grad-L2-13b is a 13-billion-parameter Llama 2-based model from DS-Archive, produced by a gradient merge of Chronos 13b v2 and Nous Hermes Llama2 13b. It is designed to combine the instruction following of Nous Hermes with the creative, longer responses of Chronos v2. It targets tasks that need both adherence to instructions and imaginative, detailed output, and has a 4096-token context length.
Chronohermes-Grad-L2-13b: Merging Instruction Following with Creativity
Chronohermes-Grad-L2-13b is a 13 billion parameter model built upon the Llama 2 architecture. It is the result of a gradient merge between two distinct base models:
- Chronos 13b v2: Known for its creative generation and longer response capabilities.
- Nous Hermes Llama2 13b: Valued for its strong instruction following.
This merge was executed using the BlockMerge_Gradient method by Gryphe, with specific merge ratios mirroring those used in Chronoboros Grad. The objective was to create a model that excels in both adhering to user instructions and generating imaginative, extended responses.
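As a rough illustration of what a gradient merge means, the sketch below blends two toy "models" layer by layer, with the blend ratio moving linearly from one end of the stack to the other. This is not the BlockMerge_Gradient script itself, and the ratios shown are assumptions chosen for readability; the actual ratios used for this model are not stated here. The function names `gradient_ratios` and `merge_layer` are hypothetical.

```python
def gradient_ratios(num_layers, start, end):
    """Linearly interpolate one blend ratio per layer, from start to end."""
    if num_layers == 1:
        return [start]
    step = (end - start) / (num_layers - 1)
    return [start + i * step for i in range(num_layers)]


def merge_layer(weights_a, weights_b, ratio):
    """Blend two flat weight lists; ratio is the share taken from model B."""
    return [(1.0 - ratio) * a + ratio * b for a, b in zip(weights_a, weights_b)]


# Toy stand-ins for two models: 5 layers of 3 weights each (made-up values).
model_a = [[1.0, 1.0, 1.0] for _ in range(5)]
model_b = [[0.0, 0.0, 0.0] for _ in range(5)]

# Assumed gradient: the first layer is all model A, the last layer is all model B.
ratios = gradient_ratios(len(model_a), start=0.0, end=1.0)
merged = [merge_layer(a, b, r) for a, b, r in zip(model_a, model_b, ratios)]

for index, (ratio, layer) in enumerate(zip(ratios, merged)):
    print(f"layer {index}: ratio={ratio:.2f} weights={layer}")
```

In a real merge the same idea would apply to the tensors of each transformer block rather than to short Python lists.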
Key Characteristics:
- Gradient Merge: Blends the weights of two Llama 2-based models layer by layer to combine their strengths.
- Balanced Performance: Aims to integrate Nous Hermes's instruction adherence with Chronos v2's creative output and response length.
- Alpaca Instruction Format: Intended for use with the Alpaca instruction format, consistent with its base models.
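For reference, a prompt in the Alpaca instruction format can be assembled as in the minimal sketch below. The helper name `build_alpaca_prompt` is hypothetical, and the template follows the commonly used Alpaca wording; check the base model cards if exact whitespace matters for your setup.

```python
from typing import Optional


def build_alpaca_prompt(instruction: str, input_text: Optional[str] = None) -> str:
    """Return an Alpaca-style prompt, with the optional Input section."""
    if input_text:
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. Write a response that appropriately "
            "completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response that "
        "appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )


prompt = build_alpaca_prompt("Write a short story about a lighthouse keeper.")
print(prompt)
```

The model's completion is expected to follow the final `### Response:` line, so the prompt and the generated text together must stay within the 4096-token context.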
Intended Use Cases:
- Applications requiring precise instruction following coupled with creative and detailed text generation.
- Scenarios where both structured output and imaginative content are important.
Limitations:
- Exhibits biases similar to those of its base models.
- Not intended for providing factual information or advice.