Uncanned/Konva-Trium-12B-v0.1
Uncanned/Konva-Trium-12B-v0.1 is an experimental 12-billion-parameter language model created by Uncanned using the SCE merge method. It uses DreadPoor/Famino-12B-Model_Stock as the base and merges in LatitudeGames/Muse-12B and MarinaraSpaghetti/NemoMix-Unleashed-12B. The model is intended for testing and evaluating the merge, and has a context length of 32,768 tokens.
Model Overview
Konva-Trium-12B-v0.1 is an experimental 12-billion-parameter language model developed by Uncanned. It was produced with the mergekit tool using the SCE (Select, Calculate, and Erase) merge method, as detailed in the SCE paper.
Merge Details
This model is a merge of three 12B models. DreadPoor/Famino-12B-Model_Stock serves as the base, and two additional models are merged into it:
- LatitudeGames/Muse-12B
- MarinaraSpaghetti/NemoMix-Unleashed-12B
The merge configuration enabled int8_mask, ran the merge computation in float16 (dtype), and saved the output weights in bfloat16.
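The settings described above can be sketched as a mergekit-style configuration. The original configuration file is not reproduced on this page, so the following is a hypothetical reconstruction from the description only: the key names follow mergekit's usual config layout, and any SCE parameters not mentioned here (for example a top-k selection ratio) are unknown and deliberately left out. The `MERGE_CONFIG` and `render_config` names are illustrative.

```python
import json

# Hypothetical reconstruction of the merge settings described in this card.
# Only values stated in the card are filled in; anything else is omitted.
MERGE_CONFIG = {
    "merge_method": "sce",
    "base_model": "DreadPoor/Famino-12B-Model_Stock",
    "models": [
        {"model": "LatitudeGames/Muse-12B"},
        {"model": "MarinaraSpaghetti/NemoMix-Unleashed-12B"},
    ],
    "parameters": {"int8_mask": True},
    "dtype": "float16",       # precision used while merging
    "out_dtype": "bfloat16",  # precision of the saved weights
}


def render_config(config: dict) -> str:
    """Serialize the config as indented JSON text.

    JSON is used here only to avoid a third-party YAML dependency;
    the structure mirrors what a YAML config file would contain.
    """
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    print(render_config(MERGE_CONFIG))
```

Treat this as a starting point for experimentation rather than the exact recipe used to build the model.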
Purpose and Usage
Konva-Trium-12B-v0.1 is explicitly designated as an experimental test model. Its primary purpose is evaluation and testing of the merge. Users are encouraged to test its performance thoroughly before any large-scale quantization or deployment. Its 32,768-token context length leaves room for long inputs.
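When testing with long inputs, it can help to check that the prompt and the requested generation together stay inside the 32,768-token window. A minimal sketch, assuming token counts have already been obtained from the model's tokenizer; the helper names are hypothetical and not part of any library:

```python
CONTEXT_LENGTH = 32768  # context length stated in this card


def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    context_length: int = CONTEXT_LENGTH) -> bool:
    """Return True if prompt plus generation fits within the context window."""
    if prompt_tokens < 0 or max_new_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_new_tokens <= context_length


def remaining_generation_budget(prompt_tokens: int,
                                context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens can still be generated after the prompt."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # Clamp at zero when the prompt alone already exceeds the window.
    return max(context_length - prompt_tokens, 0)


if __name__ == "__main__":
    print(fits_in_context(30000, 2000))
    print(remaining_generation_budget(30000))
```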
Key Characteristics
- Experimental Merge: Represents an exploration of the SCE merge method.
- Composite Architecture: Combines multiple 12B models to potentially leverage their individual strengths.
- Evaluation Focus: Intended for community testing and feedback on its emergent properties.