Alelcv27/Llama3.2-3B-DARE-Base-INST

Text Generation · Concurrency Cost: 1 · Model Size: 3.2B · Quant: BF16 · Ctx Length: 32k · Published: Apr 29, 2026 · Architecture: Transformer

Alelcv27/Llama3.2-3B-DARE-Base-INST is a 3.2-billion-parameter language model based on the Llama 3.2 architecture. It was created by Alelcv27 using the Linear DARE merge method, combining the base Llama-3.2-3B with its instruction-tuned variant. The merge is intended to combine the broad language modeling ability of the base model with the instruction-following behavior of the tuned variant for general text generation tasks.


Model Overview

Alelcv27/Llama3.2-3B-DARE-Base-INST is a 3.2-billion-parameter language model derived from the Llama 3.2 family. It was produced by Alelcv27 with MergeKit, using the Linear DARE merge method.

Merge Details

The core of this model is a combination of two foundational models:

  • meta-llama/Llama-3.2-3B: The base pre-trained language model.
  • meta-llama/Llama-3.2-3B-Instruct: The instruction-tuned variant of the Llama 3.2 base model.

The Linear DARE method (DARE stands for "Drop And REscale") was applied with equal 0.5 weights for the base and instruct models across all 28 transformer layers. DARE randomly drops a fraction of each fine-tuned model's delta (its parameter difference from the base) and rescales the surviving entries to compensate, before the deltas are linearly combined. This approach aims to integrate the general knowledge of the base model with the conversational and instruction-following capabilities of the instruct model.
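The drop-and-rescale step can be sketched on a single tensor. This is a toy illustration using NumPy, not the actual MergeKit implementation; the function name `dare_linear` and the `density` value are illustrative.

```python
import numpy as np

def dare_linear(base, tuned, weight=0.5, density=0.5, seed=0):
    """Merge one fine-tuned tensor into its base via linear DARE.

    A random fraction (1 - density) of the delta's entries is dropped,
    the survivors are rescaled by 1/density to preserve the delta's
    expected magnitude, and the result is blended linearly.
    """
    rng = np.random.default_rng(seed)
    delta = tuned - base                           # "task vector" vs. the base
    keep = rng.random(delta.shape) < density       # keep ~density of entries
    rescaled = np.where(keep, delta, 0.0) / density  # drop and re-scale
    return base + weight * rescaled

base = np.zeros(4)
tuned = np.ones(4)
merged = dare_linear(base, tuned, weight=0.5, density=0.5)
```

With `weight=0.5` and `density=0.5`, each surviving delta entry contributes `0.5 * (1 / 0.5) = 1.0`, so every merged entry is either `0.0` (dropped) or `1.0` (kept and rescaled); in expectation the merge matches a plain 0.5 linear blend.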

Key Characteristics

  • Architecture: Llama 3.2-based.
  • Parameter Count: 3.2 billion parameters.
  • Context Length: Supports a context window of 32,768 tokens.
  • Merge Method: Utilizes the Linear DARE technique for combining model weights, which can help in preserving performance while integrating different model characteristics.
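A merge like the one described above is typically expressed as a MergeKit YAML configuration. The exact configuration used for this model was not published; the sketch below is a plausible reconstruction, and the `density` value in particular is an assumption.

```yaml
# Hypothetical MergeKit config for a linear DARE merge (not the author's file)
merge_method: dare_linear
base_model: meta-llama/Llama-3.2-3B
models:
  - model: meta-llama/Llama-3.2-3B
    parameters:
      weight: 0.5
  - model: meta-llama/Llama-3.2-3B-Instruct
    parameters:
      weight: 0.5
      density: 0.5   # assumed drop density; the card does not state it
dtype: bfloat16
```

Running `mergekit-yaml config.yml ./output-model` with such a config would produce the merged checkpoint.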

Potential Use Cases

This merged model is suitable for applications requiring a balance between raw language understanding and the ability to follow instructions. It can be used for:

  • General text generation and completion.
  • Instruction-following tasks, leveraging the instruct model's fine-tuning.
  • Experimentation with merged models and the DARE technique.