occultml/Helios-10.7B-v2
Helios-10.7B-v2 by occultml is a 10.7 billion parameter language model, merged from jeonsworld/CarbonVillain-en-10.7B-v2 and kekmodel/StopCarbon-10.7B-v5 using the slerp merge method. The model supports an 8192 token context length, targets general language tasks, and posts a 42.25 average score on the Open LLM Leaderboard. Its merged architecture aims to combine the strengths of its constituent models for balanced performance across benchmarks.
Helios-10.7B-v2: Merged Language Model
Helios-10.7B-v2 is a 10.7 billion parameter language model developed by occultml, created by merging two distinct models: jeonsworld/CarbonVillain-en-10.7B-v2 and kekmodel/StopCarbon-10.7B-v5. The merge was performed with mergekit using the slerp (spherical linear interpolation) method, aiming to combine the capabilities of both base models.
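A minimal usage sketch with the Hugging Face transformers library follows; the prompt and generation settings are illustrative rather than recommended defaults, and the half-precision, device-mapped loading simply assumes a single GPU with enough memory for the ~10.7B weights.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "occultml/Helios-10.7B-v2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # keep the ~10.7B weights in half precision
    device_map="auto",          # place layers on available GPU(s) automatically
)

prompt = "Briefly explain what a model merge is."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```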
Key Characteristics
- Architecture: A merged model combining CarbonVillain-en-10.7B-v2 and StopCarbon-10.7B-v5.
- Parameter Count: Approximately 10.7 billion parameters.
- Context Length: Supports an 8192 token context window.
- Merge Method: Uses slerp for parameter interpolation, with specific t values applied to the self-attention and MLP layers (see the sketch after this list).
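The mergekit configuration itself is not reproduced here, but the sketch below illustrates what a slerp merge does to a single pair of weight tensors. The slerp helper and the w_carbonvillain / w_stopcarbon tensors are hypothetical stand-ins for corresponding layer weights in the two parent checkpoints, and t = 0.5 is only an example value, not the model's actual per-layer schedule.

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors.

    Treats each tensor as a flat vector; t=0 returns `a`, t=1 returns `b`.
    Falls back to linear interpolation when the vectors are nearly colinear.
    """
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    a_dir = a_flat / (a_flat.norm() + eps)
    b_dir = b_flat / (b_flat.norm() + eps)
    # Angle between the two parameter vectors.
    dot = torch.clamp(torch.dot(a_dir, b_dir), -1.0, 1.0)
    omega = torch.acos(dot)
    if omega.abs() < eps:
        # Nearly identical directions: plain linear interpolation is sufficient.
        merged = (1.0 - t) * a_flat + t * b_flat
    else:
        sin_omega = torch.sin(omega)
        merged = (torch.sin((1.0 - t) * omega) / sin_omega) * a_flat \
               + (torch.sin(t * omega) / sin_omega) * b_flat
    return merged.reshape(a.shape).to(a.dtype)

# Example: merge one weight matrix from the two parents with an example t of 0.5.
w_carbonvillain = torch.randn(4, 4)  # hypothetical weight from CarbonVillain-en-10.7B-v2
w_stopcarbon = torch.randn(4, 4)     # hypothetical weight from StopCarbon-10.7B-v5
merged_weight = slerp(0.5, w_carbonvillain, w_stopcarbon)
```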
Performance Overview
Evaluated on the Open LLM Leaderboard, Helios-10.7B-v2 achieved an average score of 42.25. Specific benchmark results include:
- AI2 Reasoning Challenge (25-shot): 39.16
- HellaSwag (10-shot): 46.63
- MMLU (5-shot): 41.57
- TruthfulQA (0-shot): 55.51
- Winogrande (5-shot): 70.64
Notably, the model scored 0.00 on GSM8k (5-shot), indicating it is not optimized for complex mathematical reasoning tasks. Its balanced performance across other benchmarks suggests suitability for general language understanding and generation tasks where a broad range of capabilities is desired.
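The Open LLM Leaderboard runs these benchmarks through EleutherAI's lm-evaluation-harness. The snippet below is a rough local-reproduction sketch, not the leaderboard's exact configuration: the task names, few-shot defaults, and Python API shown are assumptions that vary across harness versions.

```python
# Rough reproduction sketch using lm-evaluation-harness (pip install lm-eval).
# Task names and few-shot settings are assumptions and may differ from the
# Open LLM Leaderboard's pinned configuration.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=occultml/Helios-10.7B-v2,dtype=float16",
    tasks=["arc_challenge", "hellaswag", "mmlu", "truthfulqa_mc2", "winogrande", "gsm8k"],
    batch_size=8,
)
print(results["results"])  # per-task metrics
```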