DarkArtsForge/Agares-31B-v1
DarkArtsForge/Agares-31B-v1 is a 31-billion-parameter language model based on the Gemma4 architecture, created by DarkArtsForge as a merge test for the upcoming Goetia 31B. It was built with the DELLA merge method and selective Magnitude Pruning to mitigate degradation from its donor models. The merge explores how combining several Gemma4-based models, including one 'heretic' donor, affects censorship levels and overall performance. The model has a context length of 32768 tokens.
Agares 31B v1 Overview
Agares 31B v1 is a 31-billion-parameter model developed by DarkArtsForge, primarily as a merge test for the forthcoming Goetia 31B. It is built on the Gemma4 architecture and uses the DELLA merge method combined with selective Magnitude Pruning to integrate multiple donor models. This approach was chosen to counteract the degradation that merging can introduce, particularly when one of the donors is a Q8_0 (8-bit quantized) source.
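The magnitude-aware drop-and-rescale idea behind DELLA can be illustrated with a minimal sketch. This is not the merge code used for Agares: the function name `magprune`, the flat-list representation, and the `density` and `epsilon` defaults are assumptions chosen for illustration, and the sign-election and fusion steps of the full method are omitted.

```python
import random

def magprune(delta, density=0.5, epsilon=0.2, seed=0):
    """Simplified magnitude-aware drop-and-rescale over one flat delta vector.

    delta   : list of floats (fine-tuned weight minus base weight)
    density : average probability of keeping an entry (illustrative default)
    epsilon : spread of keep probabilities around `density` (illustrative default)
    """
    n = len(delta)
    rng = random.Random(seed)
    # Rank entries by magnitude: position 0 = smallest |delta|, n-1 = largest.
    order = sorted(range(n), key=lambda i: abs(delta[i]))
    pruned = [0.0] * n
    for rank, i in enumerate(order):
        # Keep probability rises linearly with rank, centred on `density`,
        # so large-magnitude deltas are less likely to be dropped.
        frac = rank / (n - 1) if n > 1 else 0.5
        keep_p = density - epsilon / 2 + epsilon * frac
        keep_p = min(max(keep_p, 1e-6), 1.0)
        if rng.random() < keep_p:
            # Rescale survivors by 1/keep_p so the expected value is unchanged.
            pruned[i] = delta[i] / keep_p
    return pruned

toy_delta = [0.5, -0.1, 0.02, -0.8]
print(magprune(toy_delta))
```

Entries that survive keep their sign and grow in magnitude, while dropped entries become zero; averaged over many random draws, each entry's expected value equals its original delta.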
Key Characteristics & Merge Details
- Merge Method: Uses the DELLA (Drop and rEscaLe via sampLing with mAgnitude) merge method, applied here to balance the influence of the donor models within the MLP layers.
- Base Model: The merge used google--gemma-4-31B-it as its base.
- Donor Models: A diverse set of Gemma4-based models was included in the merge, such as Darkhn-Gemma-4-31B-Animus-V14.0, LatitudeGames--Equinox-31B, Lambent--Fabled-Gemma4-31B, virtuous7373--Gemma-4-Harmonia-31B, ConicCat--Gemma4-GarnetV2-31B, llmfan46--gemma-4-Ortenzya-The-Creative-Wordsmith-31B-it-uncensored-heretic, and BeaverAI--Artemis-31B-v1h-GGUF.
- Censorship: Although the merge included a 'heretic' donor and set normalize: false in its configuration, testing with the Q0 Benchmark indicates that Agares 31B v1 is more censored than the upcoming Goetia 31B.
- Technical Innovation: The donor models were first audited with a della_audit script to gauge each one's influence, followed by weight modifications to balance each model's contribution. Creating the merge also required a sparsity v3 patch.
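The audit-then-rebalance step can be pictured with a small hypothetical sketch. The della_audit script itself is not shown in this document, so everything below is an assumption: influence is approximated as each donor's share of the total L2 norm of its delta from the base, "weight modifications" is read as adjusting per-donor merge weights, and the names `delta_norm`, `audit_influence`, `balancing_weights`, `donor_a`, and `donor_b` are invented for illustration.

```python
import math

def delta_norm(base, donor):
    """L2 norm of the difference between a donor's weights and the base."""
    return math.sqrt(sum((d - b) ** 2 for b, d in zip(base, donor)))

def audit_influence(base, donors):
    """Return each donor's share of the total delta norm (assumed influence proxy)."""
    norms = {name: delta_norm(base, w) for name, w in donors.items()}
    total = sum(norms.values())
    if total == 0:
        raise ValueError("all donors are identical to the base")
    return {name: n / total for name, n in norms.items()}

def balancing_weights(shares):
    """Merge weights inversely proportional to influence share, summing to 1."""
    if any(s <= 0 for s in shares.values()):
        raise ValueError("every donor must differ from the base")
    inverse = {name: 1.0 / s for name, s in shares.items()}
    total = sum(inverse.values())
    return {name: v / total for name, v in inverse.items()}

# Toy tensors flattened to lists; the names are placeholders, not real donors.
base = [0.0, 0.0]
donors = {"donor_a": [3.0, 4.0], "donor_b": [0.0, 1.0]}

shares = audit_influence(base, donors)
weights = balancing_weights(shares)
print(shares)
print(weights)
```

In this toy case the donor with the larger delta receives the smaller merge weight, so the product of weight and influence share is the same for both donors.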
Intended Use
Agares 31B v1 is primarily a technical exploration and testbed for merging techniques in the Gemma4 family of models. It tests how well DELLA with selective Magnitude Pruning mitigates merge-related degradation, and it offers insight into how donor model characteristics affect the merged model's behavior, including its censorship level.