Alelcv27/Llama3.1-8B-Base-DARE-Math-Code

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Apr 24, 2026 · Architecture: Transformer

Alelcv27/Llama3.1-8B-Base-DARE-Math-Code is an 8 billion parameter language model based on Meta's Llama 3.1 architecture, created by merging specialized Llama 3.1-8B variants with the Linear DARE method. The merge combines one variant focused on mathematical reasoning with another focused on code generation. With a 32,768-token context length, it is suited to tasks requiring strong performance in both mathematics and programming.


Model Overview

Alelcv27/Llama3.1-8B-Base-DARE-Math-Code is an 8 billion parameter language model built upon Meta's Llama 3.1-8B base. This model was created using the Linear DARE merge method, a technique designed to combine the strengths of multiple pre-trained models.
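The idea behind DARE (Drop And REscale) is to sparsify each fine-tuned model's parameter delta relative to the base, rescale the surviving entries, and then linearly combine the processed deltas on top of the base weights. A minimal NumPy sketch of that procedure (toy tensors and a hypothetical 0.5 drop rate, not the actual merge pipeline):

```python
import numpy as np

def dare_delta(base, finetuned, drop_rate, rng):
    """Drop And REscale: sparsify the fine-tuning delta, rescale survivors."""
    delta = finetuned - base
    mask = rng.random(delta.shape) >= drop_rate   # keep each entry with prob 1 - p
    return (delta * mask) / (1.0 - drop_rate)     # rescale to preserve expectation

def linear_dare_merge(base, experts, weights, drop_rate=0.5, seed=0):
    """Linearly combine DARE-processed deltas on top of the base weights."""
    rng = np.random.default_rng(seed)
    merged = base.copy()
    for finetuned, w in zip(experts, weights):
        merged += w * dare_delta(base, finetuned, drop_rate, rng)
    return merged

# Toy stand-ins for one weight matrix of the base and two fine-tunes.
base = np.zeros((4, 4))
math_ft = base + 1.0   # pretend "math" fine-tune
code_ft = base - 1.0   # pretend "code" fine-tune
merged = linear_dare_merge(base, [math_ft, code_ft], [0.5, 0.5])
```

With `drop_rate=0.0` this reduces to a plain linear merge of the deltas; the random dropping is what lets DARE combine models while reducing interference between their deltas.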

Key Capabilities

This model is a strategic merge of two specialized Llama 3.1-8B variants:

  • Mathematical Reasoning: It incorporates capabilities from Alelcv27/Llama3.1-8B-Base-Math, enhancing its proficiency in handling mathematical problems and logical reasoning.
  • Code Generation: It integrates features from Alelcv27/Llama3.1-8B-Base-Code, making it adept at understanding, generating, and debugging code across various programming languages.

Merge Details

The merge process utilized mergekit and specifically the Linear DARE method, as detailed in the DARE paper. The configuration involved merging Alelcv27/Llama3.1-8B-Base-Code and Alelcv27/Llama3.1-8B-Base-Math with equal weighting (0.5) across all 32 layers, using meta-llama/Llama-3.1-8B as the foundational base model.
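A mergekit configuration matching this description might look roughly as follows. This is a sketch reconstructed from the details above; the exact keys, dtype, and any density setting are assumptions, not the published config:

```yaml
# Illustrative mergekit config for a Linear DARE merge (values assumed).
merge_method: dare_linear
base_model: meta-llama/Llama-3.1-8B
models:
  - model: Alelcv27/Llama3.1-8B-Base-Code
    parameters:
      weight: 0.5
  - model: Alelcv27/Llama3.1-8B-Base-Math
    parameters:
      weight: 0.5
dtype: bfloat16
```

Equal 0.5 weights apply the two deltas symmetrically across all 32 transformer layers, as described above.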

Recommended Use Cases

Given its specialized components, this model is particularly well-suited for applications that require a strong combination of:

  • Complex mathematical problem-solving.
  • Accurate and efficient code generation.
  • Technical reasoning tasks that bridge both numerical and programmatic domains.