yamatazen/Qwen3-V-Science-14B-v2 is a 14-billion-parameter language model merged with the DELLA method from a Qwen3-V-Science-14B base and two other Qwen3-14B variants. The merge aims to combine the strengths of its constituent models on top of the Qwen3 architecture, making it suitable for general-purpose language understanding and generation tasks.
Model Overview
yamatazen/Qwen3-V-Science-14B-v2 is a 14 billion parameter language model created by yamatazen. It is a merged model, built upon a base of yamatazen/Qwen3-V-Science-14B and incorporating two additional Qwen3-14B variants: ValiantLabs/Qwen3-14B-Esper3 and soob3123/GrayLine-Qwen3-14B.
Merge Details
This model was constructed using the DELLA merge method, introduced in the paper "DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling". The merge was performed with mergekit, assigning each contributing model its own density and weight parameters; soob3123/GrayLine-Qwen3-14B received the higher weight (0.7) compared to ValiantLabs/Qwen3-14B-Esper3 (0.3).
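A merge like this is typically driven by a mergekit YAML configuration. The sketch below reflects the models and weights stated above; the density values and dtype are illustrative assumptions, since the card does not state them:

```yaml
# Hypothetical mergekit config for a DELLA merge of this kind.
# Weights (0.3 / 0.7) are from the model card; densities and dtype are assumed.
merge_method: della
base_model: yamatazen/Qwen3-V-Science-14B
models:
  - model: ValiantLabs/Qwen3-14B-Esper3
    parameters:
      density: 0.5   # assumed; fraction of delta parameters kept
      weight: 0.3
  - model: soob3123/GrayLine-Qwen3-14B
    parameters:
      density: 0.5   # assumed
      weight: 0.7
dtype: bfloat16      # assumed
```

Running `mergekit-yaml config.yml ./output-dir` on such a file would produce the merged checkpoint.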
Key Characteristics
- Architecture: Based on the Qwen3 family of models.
- Parameter Count: 14 billion parameters.
- Merge Method: Employs the DELLA method for combining pre-trained language models.
- Context Length: Supports a context length of 32768 tokens.
Potential Use Cases
Given its foundation in the Qwen3 architecture and the DELLA merge technique, this model is suitable for a range of general-purpose language understanding and generation tasks, benefiting from the combined knowledge of its constituent models.
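For chat-style use, Qwen3 models expect prompts in the ChatML format. In practice you would load the model with the transformers library and call `tokenizer.apply_chat_template()`, but the minimal sketch below shows the underlying prompt structure (the helper function name is our own, not part of any library):

```python
def build_chat_prompt(user_message: str,
                      system_message: str = "You are a helpful assistant.") -> str:
    """Assemble a ChatML-style prompt as used by the Qwen3 family.

    This is an illustrative sketch; with transformers you would instead call
    tokenizer.apply_chat_template(messages, add_generation_prompt=True).
    """
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )


prompt = build_chat_prompt("Summarize the DELLA merge method in one sentence.")
```

The trailing `<|im_start|>assistant\n` leaves the prompt open for the model to generate its reply.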