rombodawg/LosslessMegaCoder-llama2-7b-mini

Text generation · Model size: 7B · Quant: FP8 · Context length: 4K · Published: Aug 13, 2023 · License: llama2 · Architecture: Transformer

LosslessMegaCoder-llama2-7b-mini is a 7 billion parameter Llama 2-based language model developed by rombodawg and andreaskoepf, with a context length of 4096 tokens. It is trained specifically on the LosslessMegaCodeTrainingV2_1m_Evol_Uncensored dataset, filtered to entries of at least 100 tokens, which makes it well suited to coding tasks. It demonstrates strong performance in code generation and aims to be a leading coding model within its size class.


Overview

rombodawg/LosslessMegaCoder-llama2-7b-mini is a 7 billion parameter Llama 2-based model developed in collaboration by rombodawg and andreaskoepf. It is distinguished by its specialized training on a filtered version of the LosslessMegaCodeTrainingV2_1m_Evol_Uncensored dataset, focusing on code-related data entries with a minimum of 100 tokens. This targeted training aims to make it exceptionally proficient in coding tasks, positioning it as a strong performer among 7B parameter models.

Key Capabilities

  • Exceptional Coding Performance: Designed to excel in code generation and understanding, potentially outperforming other 7B models in this domain.
  • Specialized Training Data: Utilizes a unique, filtered dataset (megacode2-min100) derived from a larger uncensored code training corpus.
  • ChatML Prompt Format: Supports the ChatML format for structured conversations, including system, user, and assistant roles.
  • Quantized Versions Available: Quantized versions, such as GPTQ, are available for efficient deployment.

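The ChatML format mentioned above wraps each conversation turn in `<|im_start|>` / `<|im_end|>` markers. A minimal sketch of assembling such a prompt is below; the helper function and example messages are illustrative, not taken from the model card:

```python
# Minimal ChatML prompt builder. Only the <|im_start|>/<|im_end|>
# framing and the system/user/assistant roles come from the ChatML
# convention; the helper name and messages are illustrative.
def build_chatml_prompt(messages):
    """messages: list of (role, content) tuples, where role is
    'system', 'user', or 'assistant'."""
    parts = []
    for role, content in messages:
        parts.append(f"<|im_start|>{role}\n{content}<|im_end|>")
    # Leave an open assistant turn for the model to complete.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    ("system", "You are a helpful coding assistant."),
    ("user", "Write a Python function that reverses a string."),
])
print(prompt)
```

The resulting string would be passed to the tokenizer as-is, with generation stopping on the `<|im_end|>` marker.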
Benchmarks and Evaluation

Evaluations on the Open LLM Leaderboard show an average score of 45.33 across the seven benchmarks below:

  • ARC (25-shot): 53.5
  • HellaSwag (10-shot): 77.38
  • MMLU (5-shot): 49.72
  • TruthfulQA (0-shot): 45.77
  • Winogrande (5-shot): 74.03
  • GSM8K (5-shot): 9.55
  • DROP (3-shot): 7.34

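The reported average is simply the arithmetic mean of the seven benchmark scores above, which can be checked directly:

```python
# Open LLM Leaderboard scores reported above for
# LosslessMegaCoder-llama2-7b-mini (benchmark -> score).
scores = {
    "ARC (25-shot)": 53.5,
    "HellaSwag (10-shot)": 77.38,
    "MMLU (5-shot)": 49.72,
    "TruthfulQA (0-shot)": 45.77,
    "Winogrande (5-shot)": 74.03,
    "GSM8K (5-shot)": 9.55,
    "DROP (3-shot)": 7.34,
}

# The leaderboard average is the plain (unweighted) mean of the scores.
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 45.33
```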
Further detailed results and sampling reports are available via external links provided in the original model card, including FastEval-OpenAssistant and Open-Assistant sampling reports.

Good For

  • Developers requiring a compact yet powerful model for code generation.
  • Applications focused on programming assistance, code completion, or code analysis.
  • Experimentation with models trained on highly specialized code datasets.