Cartinoe5930/SOLAR-10.7B-iDUS-1layer is a 10.7 billion parameter language model developed by Cartinoe5930 and built with a variant of the DUS architecture. It explores an 'interlocked-DUS' (iDUS) approach, specifically the 'iDUS-1layer' configuration, which alternates single layers from each base model. It was created to test whether further minimizing layer distance helps, but experimental results show significantly lower performance than the original DUS and other iDUS variants.
Model Overview
Cartinoe5930/SOLAR-10.7B-iDUS-1layer is an experimental 10.7 billion parameter model developed by Cartinoe5930, designed to test a variant of the DUS (Depth Up-Scaling) architecture called interlocked-DUS (iDUS). The core idea behind iDUS is to improve model performance by further minimizing the layer distance, a concept that is already important in DUS, through an interlocking merge mechanism.
Architectural Details
This model, iDUS-1layer, implements the iDUS concept by alternating single layers from each base model. Where the original DUS connects each base model's layers as one contiguous block, iDUS divides the layers into groups and interleaves those groups so that they interlock. The goal of this variant was to reduce layer distance more effectively than DUS does.
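The difference between contiguous stacking and interlocking can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's merge recipe: the 32-layer base model and the 8 dropped layers are borrowed from the published SOLAR DUS setup, the exact layer ranges used for iDUS are not given here, and the function names and the `max_seam_gap` proxy for "layer distance" are hypothetical.

```python
from typing import List, Tuple

# (copy name, layer index inside that copy); "A" and "B" are two copies
# of the same base model. All names here are illustrative.
Layer = Tuple[str, int]


def _kept_layers(n_layers: int, n_drop: int) -> Tuple[List[Layer], List[Layer]]:
    # Assumption: copy A loses its last n_drop layers and copy B its first
    # n_drop layers, as in the SOLAR DUS description.
    a = [("A", i) for i in range(n_layers - n_drop)]
    b = [("B", i) for i in range(n_drop, n_layers)]
    return a, b


def dus_order(n_layers: int = 32, n_drop: int = 8) -> List[Layer]:
    """Contiguous stacking: all kept layers of A, then all kept layers of B."""
    a, b = _kept_layers(n_layers, n_drop)
    return a + b


def interlocked_order(n_layers: int = 32, n_drop: int = 8, group: int = 1) -> List[Layer]:
    """Interleave groups of `group` layers from A and B (hypothetical iDUS layout)."""
    a, b = _kept_layers(n_layers, n_drop)
    order: List[Layer] = []
    for start in range(0, len(a), group):
        order.extend(a[start:start + group])
        order.extend(b[start:start + group])
    return order


def max_seam_gap(order: List[Layer]) -> int:
    """Largest jump in layer index where the stack switches between copies.

    This is an illustrative proxy for "layer distance", not the author's metric.
    """
    gaps = [
        abs(nxt[1] - cur[1])
        for cur, nxt in zip(order, order[1:])
        if cur[0] != nxt[0]
    ]
    return max(gaps, default=0)


if __name__ == "__main__":
    for name, order in [
        ("DUS", dus_order()),
        ("iDUS-1layer (sketch)", interlocked_order(group=1)),
        ("iDUS-8layer (sketch)", interlocked_order(group=8)),
    ]:
        print(name, len(order), "layers, max seam gap:", max_seam_gap(order))
```

Under these assumptions, every layout yields the same number of layers; what changes is how often the stack switches between copies and how far apart the joined layers are. With `group=1` no two consecutive layers of a base model remain adjacent.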
Experimental Results and Limitations
Evaluated on the Hugging Face Open LLM Leaderboard benchmarks (ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K), iDUS-1layer scored significantly lower than both the original DUS implementation and another iDUS variant, iDUS-8layer. This suggests that minimizing layer distance is not enough on its own, and that how consecutive layers are merged also plays a crucial role in effective information processing. The developer noted that alternating single layers in iDUS-1layer caused the model to perform unexpectedly poorly. Because of limited computational resources, further pre-training and detailed analysis were not possible and are left for future work.