Model Overview
LosslessMegaCoder-llama2-13b-mini is a 13-billion-parameter model built on the Llama 2 architecture, developed through a collaboration between rombodawg and andreaskoepf, an affiliate of Open Assistant. It is an early application of the large LosslessMegaCodeTrainingV2_1m_Evol_Uncensored dataset, trained on a filtered subset (megacode2-min100) from which entries with fewer than 100 tokens were removed.
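The min-100 filtering step behind megacode2-min100 can be sketched as follows. This is an illustrative reconstruction, not the actual preprocessing script; in particular, whitespace splitting stands in here for whatever tokenizer was actually used to count tokens.

```python
# Illustrative sketch of the megacode2-min100 filtering step:
# drop dataset entries whose token count is below 100.
# Whitespace splitting is an assumption standing in for the real tokenizer.

def count_tokens(text: str) -> int:
    return len(text.split())

def filter_min_tokens(entries, min_tokens: int = 100):
    """Keep only entries whose text meets the minimum token count."""
    return [e for e in entries if count_tokens(e["text"]) >= min_tokens]

short = {"text": "def add(a, b): return a + b"}          # well under 100 tokens
long_enough = {"text": " ".join(["token"] * 150)}        # 150 tokens
kept = filter_min_tokens([short, long_enough])
print(len(kept))  # the short entry is dropped
```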
Key Capabilities & Performance
- Code Generation: The model is specifically trained on a large code-centric dataset, making it suitable for programming-related tasks.
- HumanEval+ Score: It achieves a HumanEval+ score of 0.29, comparable to LLaMA-2 70B Chat on the same benchmark.
- Instruction Following: The model uses the ChatML prompt format with system, user, and assistant roles, and is compatible with the GPT4All and Oobabooga Text-Generation-WebUI templates.
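A ChatML prompt for this model can be assembled as in the following sketch. The `<|im_start|>`/`<|im_end|>` delimiters are standard ChatML; the helper function and the example system message are illustrative, not part of the model's official tooling.

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a ChatML prompt with system and user turns,
    leaving the assistant turn open for the model to complete."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "You are a helpful coding assistant.",  # illustrative system message
    "Write a Python function that reverses a string.",
)
print(prompt)
```

The trailing open `assistant` turn is what cues the model to generate its reply rather than continue the user's text.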
Benchmarks
Evaluations on the Open LLM Leaderboard show the following performance:
- Avg.: 49.92
- ARC (25-shot): 60.58
- HellaSwag (10-shot): 81.26
- MMLU (5-shot): 57.92
- TruthfulQA (0-shot): 48.89
- Winogrande (5-shot): 76.95
- GSM8K (5-shot): 15.92
- DROP (3-shot): 7.89
Use Cases
This model is well suited to applications requiring code understanding, generation, and completion, given its specialized training on a large code corpus. Its HumanEval+ performance suggests utility in programming assistance and related development workflows.
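HumanEval+ scores a model by running its completions against unit tests, so a completion counts only if it is functionally correct. A minimal sketch of that check, with a hypothetical model completion standing in for real output, looks like:

```python
# Minimal sketch of HumanEval-style functional scoring:
# a completion passes only if prompt + completion survives the task's tests.
# The completion string below is a hypothetical model output.

def check_completion(prompt: str, completion: str, tests: str) -> bool:
    """Execute prompt + completion + tests in a fresh namespace;
    pass iff nothing raises. (Real harnesses sandbox this step.)"""
    namespace = {}
    try:
        exec(prompt + completion + "\n" + tests, namespace)
        return True
    except Exception:
        return False

task_prompt = "def reverse_string(s):\n"
model_completion = "    return s[::-1]\n"     # hypothetical model output
unit_tests = (
    "assert reverse_string('abc') == 'cba'\n"
    "assert reverse_string('') == ''\n"
)
passed = check_completion(task_prompt, model_completion, unit_tests)
print(passed)  # True
```

Production evaluation harnesses additionally isolate execution (timeouts, subprocesses) since generated code is untrusted.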