rombodawg/LosslessMegaCoder-llama2-13b-mini

Text generation · Model size: 13B · Quantization: FP8 · Context length: 4K · Published: Aug 15, 2023 · License: Llama 2 · Architecture: Transformer · Open weights

LosslessMegaCoder-llama2-13b-mini is a 13 billion parameter Llama 2-based language model developed collaboratively by rombodawg and andreaskoepf, an Open Assistant affiliate. It is one of the first models trained on the LosslessMegaCodeTrainingV2_1m_Evol_Uncensored dataset, specifically a code-focused filtered version (megacode2-min100). The model is optimized for code-related tasks and achieves a HumanEval+ score of 0.29, comparable to LLaMA-2 70B Chat.


Model Overview

LosslessMegaCoder-llama2-13b-mini is a 13 billion parameter model built on the Llama 2 architecture, developed through a collaboration between rombodawg and andreaskoepf, an affiliate of Open Assistant. This model represents an early application of the extensive LosslessMegaCodeTrainingV2_1m_Evol_Uncensored dataset, utilizing a filtered subset (megacode2-min100) where data entries with fewer than 100 tokens were removed.
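The min-100 filtering step can be sketched as follows. This is a hypothetical reconstruction, not the actual dataset pipeline: the model card does not specify which tokenizer was used for the length check, so whitespace-split word count is used here as a rough stand-in.

```python
# Hypothetical sketch of the megacode2-min100 filtering step: keep only
# entries with at least 100 tokens. Whitespace splitting stands in for
# the real (undocumented) tokenizer.

MIN_TOKENS = 100

def filter_short_entries(entries, min_tokens=MIN_TOKENS):
    """Drop dataset entries whose text falls below the token threshold."""
    return [e for e in entries if len(e["text"].split()) >= min_tokens]

entries = [
    {"text": "def add(a, b): return a + b"},  # too short, removed
    {"text": " ".join(["token"] * 150)},      # long enough, kept
]
kept = filter_short_entries(entries)
print(len(kept))  # → 1
```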

Key Capabilities & Performance

  • Code Generation: The model is specifically trained on a large code-centric dataset, making it suitable for programming-related tasks.
  • HumanEval+ Score: It achieves a HumanEval+ score of 0.29, comparable to LLaMA-2 70B Chat on the same benchmark.
  • Instruction Following: The model uses the ChatML format for prompting, supporting system, user, and assistant roles, and is compatible with GPT4All and Oobabooga Text-Generation-WebUI templates.
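The ChatML layout described above can be assembled by hand. The helper below is an illustrative sketch (not part of the model's own tooling), assuming the standard `<|im_start|>`/`<|im_end|>` ChatML delimiters:

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a ChatML prompt with system and user turns,
    ending with an open assistant turn for the model to complete."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "You are a helpful coding assistant.",
    "Write a Python function that reverses a string.",
)
print(prompt)
```

The generated string would then be passed to the model as-is; generation stops when the model emits the `<|im_end|>` token closing the assistant turn.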

Benchmarks

Evaluations on the Open LLM Leaderboard show the following performance:

  • Avg.: 49.92
  • ARC (25-shot): 60.58
  • HellaSwag (10-shot): 81.26
  • MMLU (5-shot): 57.92
  • TruthfulQA (0-shot): 48.89
  • Winogrande (5-shot): 76.95
  • GSM8K (5-shot): 15.92
  • DROP (3-shot): 7.89
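As a sanity check, the reported average is the arithmetic mean of the seven individual benchmark scores listed above:

```python
# Verify that the leaderboard average matches the mean of the per-task scores.
scores = {
    "ARC": 60.58,
    "HellaSwag": 81.26,
    "MMLU": 57.92,
    "TruthfulQA": 48.89,
    "Winogrande": 76.95,
    "GSM8K": 15.92,
    "DROP": 7.89,
}
avg = sum(scores.values()) / len(scores)
print(round(avg, 2))  # → 49.92
```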

Use Cases

This model is particularly well-suited for applications requiring code understanding, generation, and completion, given its specialized training on a large code dataset. Its performance on HumanEval+ suggests its utility in programming assistance and related development workflows.