uukuguy/zephyr-7b-alpha-dare-0.85: An Experiment in Parameter Efficiency
This model, developed by uukuguy, is a 7-billion-parameter variant of Zephyr-7B-alpha produced with the experimental DARE (Drop And REscale) method. The core idea behind DARE is that a large fraction of a fine-tuned model's delta parameters (the element-wise differences between the fine-tuned and base weights) can be dropped (set to zero), with the survivors rescaled, without compromising performance.
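The drop-and-rescale step can be sketched as below. This is a minimal toy illustration of the idea, not the reference DARE implementation, which operates on full model checkpoints rather than plain Python lists:

```python
import random

def dare(delta, mask_rate, rescale=True, seed=0):
    """Randomly zero a fraction `mask_rate` of delta parameters and,
    if `rescale` is set, divide the survivors by (1 - mask_rate) so the
    expected value of each delta is preserved."""
    rng = random.Random(seed)
    out = []
    for d in delta:
        if rng.random() < mask_rate:
            out.append(0.0)  # dropped delta
        else:
            out.append(d / (1.0 - mask_rate) if rescale else d)
    return out

# Toy example: drop ~85% of deltas; survivors are scaled by 1 / 0.15
delta = [0.02, -0.01, 0.03, 0.005, -0.04]
print(dare(delta, mask_rate=0.85))
```

Because each surviving delta is divided by the keep probability, the masked deltas are an unbiased estimate of the originals in expectation, which is what lets such a high drop rate leave behavior largely intact.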
Key Characteristics
- Parameter Efficiency Experiment: Investigates the DARE method with a weight_mask_rate of 0.85, meaning 85% of delta parameters are masked to zero.
- Rescaling: Uses use_weight_rescale: True and scaling_coefficient: 1.0 to compensate for the dropped parameters.
- Base Model: Built on the HuggingFaceH4/zephyr-7b-alpha foundation.
- Context Length: Supports an 8192-token context window.
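Putting these settings together, the merged weights would be formed roughly as follows. This is a hedged sketch in which the configuration keys above are assumed to map onto the function parameters shown; it is not the author's actual merge script:

```python
import random

def dare_merge(base, finetuned, weight_mask_rate=0.85,
               use_weight_rescale=True, scaling_coefficient=1.0, seed=0):
    """Merge fine-tuned weights back into base weights, DARE-style:
    compute delta = finetuned - base, drop a weight_mask_rate fraction
    of deltas, optionally rescale the survivors by 1/(1 - mask rate),
    then add scaling_coefficient * delta back onto the base weight."""
    rng = random.Random(seed)
    merged = []
    for b, f in zip(base, finetuned):
        delta = f - b
        if rng.random() < weight_mask_rate:
            delta = 0.0  # this delta is dropped entirely
        elif use_weight_rescale:
            delta /= (1.0 - weight_mask_rate)  # rescale the survivor
        merged.append(b + scaling_coefficient * delta)
    return merged

# Toy 4-weight example with the model card's settings
base = [0.5, -0.2, 0.1, 0.0]
tuned = [0.52, -0.25, 0.13, 0.01]
print(dare_merge(base, tuned))
```

With scaling_coefficient set to 1.0, each merged weight is either the untouched base weight (where the delta was dropped) or the base weight plus the rescaled delta.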
Performance Insights
The model is evaluated across standard benchmarks, including ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K, and DROP. While it does not lead in any single category, its average score of 52.4 on the provided benchmark table matches that of the base Zephyr-7B-alpha (also 52.4), suggesting that the DARE method can maintain performance even at an 85% mask rate. This makes it a useful reference point for research into model compression and efficient fine-tuning techniques.