Name: Cartinoe5930/SOLAR-10.7B-iDUS-1layer API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: Cartinoe5930

Model Overview

Cartinoe5930/SOLAR-10.7B-iDUS-1layer is an experimental 10.7-billion-parameter model developed by Cartinoe5930 to test interlocked-DUS (iDUS), a variant of the DUS (Depth Up-Scaling) method. iDUS aims to improve model performance by further reducing layer distance, a key consideration in DUS, through an interlocking merge of layers.

Architectural Details

This model, iDUS-1layer, implements iDUS by alternately merging one layer at a time from each base model. Where standard DUS joins the two layer stacks as whole blocks, iDUS splits the layers into groups and interleaves those groups so that they interlock. The variant was intended to reduce layer distance more effectively than standard DUS.
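
The difference between block-wise stacking and interlocked merging can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's merge recipe: the function names, the toy layer counts (8 layers, 2 dropped per copy), the choice of which layers each copy keeps, and the "largest index jump" proxy for layer distance are all hypothetical and do not come from the model card.

```python
from typing import List, Tuple

# A layer is identified by (source copy, index in the base model).
Layer = Tuple[str, int]


def dus_order(n_layers: int, n_drop: int) -> List[Layer]:
    """Block-wise stacking: copy A minus its last n_drop layers,
    followed by copy B minus its first n_drop layers (assumed layout)."""
    a = [("A", i) for i in range(n_layers - n_drop)]
    b = [("B", i) for i in range(n_drop, n_layers)]
    return a + b


def interlocked_order(n_layers: int, n_drop: int, group_size: int) -> List[Layer]:
    """Interlocked stacking: take group_size layers from A, then
    group_size layers from B, and repeat (hypothetical reading of iDUS)."""
    a = [("A", i) for i in range(n_layers - n_drop)]
    b = [("B", i) for i in range(n_drop, n_layers)]
    merged: List[Layer] = []
    for start in range(0, len(a), group_size):
        merged.extend(a[start:start + group_size])
        merged.extend(b[start:start + group_size])
    return merged


def max_index_jump(order: List[Layer]) -> int:
    """Hypothetical proxy for layer distance: the largest gap in
    base-model index between two adjacent layers in the merged stack."""
    return max(abs(order[k + 1][1] - order[k][1]) for k in range(len(order) - 1))


if __name__ == "__main__":
    block = dus_order(n_layers=8, n_drop=2)
    one_layer = interlocked_order(n_layers=8, n_drop=2, group_size=1)
    print("block-wise :", block, "max jump =", max_index_jump(block))
    print("interlocked:", one_layer, "max jump =", max_index_jump(one_layer))
```

With `group_size=1` the sketch alternates single layers, as iDUS-1layer is described as doing; a larger `group_size` corresponds to interleaving multi-layer groups. In this toy setting the interlocked order has a smaller maximum index jump than the block-wise order, which is the intuition behind reducing layer distance, though the proxy says nothing about downstream quality.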

Experimental Results and Limitations

Evaluation on the Hugging Face Open LLM Leaderboard showed that iDUS-1layer scored significantly lower across the benchmarks (ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K) than both the original DUS implementation and another iDUS variant, iDUS-8layer. This suggests that, while minimizing layer distance is important, how consecutive layers are merged also plays a crucial role in effective information processing. The developer noted that alternating single layers caused iDUS-1layer to perform unexpectedly poorly. Because of limited computational resources, further pre-training and detailed analysis were not possible and are left for future work.
