yamatazen/Qwen3-V-Science-14B-v2
Text generation · Model size: 14B · Quantization: FP8 · Context length: 32k · Architecture: Transformer · Concurrency cost: 1

yamatazen/Qwen3-V-Science-14B-v2 is a 14-billion-parameter language model merged with the DELLA method from a Qwen3-V-Science-14B base and two other Qwen3-14B variants. The merge aims to combine the strengths of its constituent models on a shared Qwen3 architecture, making it suitable for general language understanding and generation tasks.


Model Overview

yamatazen/Qwen3-V-Science-14B-v2 is a 14-billion-parameter language model created by yamatazen. It is a merged model, built upon a base of yamatazen/Qwen3-V-Science-14B and incorporating two additional Qwen3-14B variants: ValiantLabs/Qwen3-14B-Esper3 and soob3123/GrayLine-Qwen3-14B.

Merge Details

This model was constructed using the DELLA merge method, introduced in the paper "DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling", which prunes delta parameters via magnitude-based sampling and rescales the survivors to reduce interference between merged models. The merge was performed with mergekit and assigned each contributing model its own density and weight parameters, with soob3123/GrayLine-Qwen3-14B weighted more heavily (0.7) than ValiantLabs/Qwen3-14B-Esper3 (0.3).
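
For reference, the sketch below shows how a merge of this kind could be expressed for mergekit's della method and run from Python. Only the base model, the model list, and the 0.7/0.3 weights are taken from the card; the density values, dtype, and output directory are illustrative assumptions.

```python
# Sketch of reproducing a DELLA merge like this one with mergekit.
# The weights (0.3 / 0.7) and the model list come from the card; the
# density values and dtype below are assumptions for illustration.
import subprocess
from pathlib import Path

config = """\
merge_method: della
base_model: yamatazen/Qwen3-V-Science-14B
models:
  - model: ValiantLabs/Qwen3-14B-Esper3
    parameters:
      weight: 0.3
      density: 0.5   # assumed, not stated on the card
  - model: soob3123/GrayLine-Qwen3-14B
    parameters:
      weight: 0.7
      density: 0.5   # assumed, not stated on the card
dtype: bfloat16
"""

Path("della-config.yaml").write_text(config)

# mergekit-yaml is mergekit's standard CLI entry point:
#   mergekit-yaml <config> <output-directory>
subprocess.run(
    ["mergekit-yaml", "della-config.yaml", "./Qwen3-V-Science-14B-v2-merge"],
    check=True,
)
```

In DELLA, density controls what fraction of each model's delta parameters survives the magnitude-based sampling step, while weight scales each model's contribution in the final combination; the higher weight on GrayLine-Qwen3-14B therefore pulls the merge more strongly toward that model's behavior.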

Key Characteristics

  • Architecture: Based on the Qwen3 family of models.
  • Parameter Count: 14 billion parameters.
  • Merge Method: Employs the DELLA method for combining pre-trained language models.
  • Context Length: Supports a context length of 32768 tokens (a quick way to confirm these values from the model's configuration is sketched below).
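
These figures can be double-checked against the published configuration. The snippet below is a minimal sketch that assumes the checkpoint exposes a standard Hugging Face Qwen3 config.

```python
# Minimal config check, assuming a standard Hugging Face Qwen3 config schema.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("yamatazen/Qwen3-V-Science-14B-v2")

print(config.model_type)               # architecture family (expected: qwen3)
print(config.num_hidden_layers)        # transformer depth
print(config.hidden_size)              # model width
print(config.max_position_embeddings)  # maximum context window in the config
```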

Potential Use Cases

Given its foundation in the Qwen3 architecture and the DELLA merge technique, this model is suitable for a range of general-purpose language understanding and generation tasks, benefiting from the combined knowledge of its constituent models.
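
As a starting point, the snippet below shows one way the model could be loaded and prompted. It assumes the checkpoint follows the usual Hugging Face causal-LM interface and ships a chat template; the prompt and generation settings are arbitrary examples.

```python
# Example of loading the merged model and generating a response with
# transformers; prompt and generation settings are illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yamatazen/Qwen3-V-Science-14B-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "user", "content": "Explain the difference between nuclear fission and fusion in two sentences."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```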