martyn/codellama-megamerge-dare-34b

Task: Text Generation | Concurrency Cost: 2 | Model Size: 34B | Quant: FP8 | Ctx Length: 32k | Published: Dec 17, 2023 | License: llama2 | Architecture: Transformer | Open Weights | Cold

martyn/codellama-megamerge-dare-34b is a 34-billion-parameter language model created by martyn through a DARE merge of multiple CodeLlama-based models. It combines several specialized CodeLlama versions, including instruction-tuned and Python-specific variants, to improve code generation and code understanding. It is intended primarily for programming tasks, giving developers a single consolidated model for code-centric applications.

Overview

martyn/codellama-megamerge-dare-34b is a 34-billion-parameter language model developed by martyn. It was created with the DARE (Drop And REscale) merging technique, using the safetensors-merge-supermario tool, to combine several prominent CodeLlama-based models. DARE works on the delta parameters (the difference between each fine-tuned model's weights and the shared base weights): it randomly drops a large fraction of those deltas, rescales the survivors to compensate, and adds the result back onto the base model.
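The drop-and-rescale step can be illustrated with a minimal sketch on flat lists of floats. This is not the safetensors-merge-supermario implementation: the function names, the default drop rate of 0.9, the seed, and the choice to simply sum the sparsified deltas are all assumptions made for illustration, since the page does not state the settings used for this merge.

```python
import random


def dare_delta(base, tuned, drop_rate, rng):
    """Return the sparsified, rescaled delta (tuned - base) for one model.

    Each delta entry is dropped with probability `drop_rate`; survivors are
    multiplied by 1 / (1 - drop_rate) so the expected delta is unchanged.
    """
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    if len(base) != len(tuned):
        raise ValueError("base and tuned must have the same length")
    keep_scale = 1.0 / (1.0 - drop_rate)
    result = []
    for b, t in zip(base, tuned):
        delta = t - b
        dropped = rng.random() < drop_rate
        result.append(0.0 if dropped else delta * keep_scale)
    return result


def dare_merge(base, tuned_models, drop_rate=0.9, seed=0):
    """Add the DARE-processed delta of every tuned model onto the base.

    Summing the deltas is an illustrative assumption; real merge tools may
    weight or average them differently.
    """
    rng = random.Random(seed)
    merged = list(base)
    for tuned in tuned_models:
        for i, d in enumerate(dare_delta(base, tuned, drop_rate, rng)):
            merged[i] += d
    return merged


# Hypothetical toy weights: one base vector and two fine-tuned variants.
base_weights = [1.0, 2.0]
tuned_variants = [[1.5, 2.0], [1.0, 3.0]]
merged_weights = dare_merge(base_weights, tuned_variants, drop_rate=0.5)
```

In a real merge the same operation would be applied tensor by tensor across the checkpoints, all of which must share the base model's architecture and parameter shapes.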

Key Capabilities

This mega-merge integrates the strengths of various specialized CodeLlama models, including:

  • CodeLlama-34b-hf: The foundational CodeLlama model.
  • CodeLlama-34b-Instruct-hf: An instruction-tuned variant for following programming directives.
  • CodeLlama-34b-Python-hf: Optimized specifically for Python code generation and understanding.
  • allenai/codetulu-2-34b: A code-focused model from AllenAI.
  • Phind/Phind-CodeLlama-34B-v1 & v2: Phind's fine-tuned versions of CodeLlama-34B, aimed at coding assistance.
  • Phind/Phind-CodeLlama-34B-Python-v1: Phind's Python-specific CodeLlama variant.
  • uukuguy/speechless-codellama-34b-v2.0: Another specialized CodeLlama derivative.

By merging these diverse models, martyn/codellama-megamerge-dare-34b aims to offer a robust and versatile solution for a wide range of code-related tasks, combining instruction-following, general code generation, and Python-specific expertise into a single model.

Good For

  • Advanced code generation across multiple programming languages.
  • Understanding and responding to complex coding instructions.
  • Python-centric development and scripting.
  • Applications requiring a consolidated model with broad code-related capabilities derived from multiple specialized sources.
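For the uses listed above, a minimal loading sketch with the Hugging Face transformers library might look like the following. It assumes the weights are available on the Hugging Face Hub under the repository id shown, that transformers, torch, and accelerate are installed, and that enough GPU memory is available for a 34B model. The helper name generate_code and the default token limit are hypothetical, and the page does not specify a prompt format, so a plain code prefix is used.

```python
MODEL_ID = "martyn/codellama-megamerge-dare-34b"  # assumed Hub repository id


def generate_code(prompt, max_new_tokens=128):
    """Load the model and return a completion for `prompt`.

    Imports are deferred so that defining this helper does not require
    transformers to be installed or the weights to be downloaded.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        device_map="auto",   # spread layers across available devices (needs accelerate)
        torch_dtype="auto",  # use the dtype stored in the checkpoint
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling generate_code("def fibonacci(n):") would download the checkpoint on first use and return the prompt followed by the model's continuation. For repeated calls, loading the tokenizer and model once outside the function would avoid reloading the weights each time.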