pankajmathur/model_007_13b_v2

Task: Text Generation | Concurrency Cost: 1 | Model Size: 13B | Quant: FP8 | Ctx Length: 4k | Published: Aug 12, 2023 | License: llama2 | Architecture: Transformer | Open Weights

pankajmathur/model_007_13b_v2 is a 13 billion parameter Llama2-based model developed by Pankaj Mathur and fine-tuned for both explanatory and instructional tasks. Its training data includes Open-Platypus, Alpaca, WizardLM, and Orca_minis_v1. The model scores 82.42% acc_norm on HellaSwag and 56.37% acc_norm on MMLU, and is intended for general-purpose instruction following and explanation generation.

Overview

pankajmathur/model_007_13b_v2 is a 13 billion parameter language model built on the Llama2 architecture and developed by Pankaj Mathur. It is designed as a hybrid (explain + instruct) style LLM: it is meant both to give detailed explanations and to follow direct instructions. It was fine-tuned on a collection of datasets that includes Open-Platypus, Alpaca, WizardLM, Dolly-V2, Dolphin Samples, Orca_minis_v1, Alpaca_orca, and WizardLM_orca.

Key Capabilities

  • Hybrid Instruction Following and Explanation Generation: Fine-tuned both to carry out instructions and to generate detailed, explanatory responses.
  • Llama2 Base: Built on the Llama2 architecture at the 13B parameter scale.
  • Diverse Training: Trained on a mix of instruction-style and explanation-style datasets (listed in the Overview).

Performance Highlights

Evaluated with the EleutherAI Language Model Evaluation Harness, the model reports the following scores:

  • HellaSwag: 82.42% acc_norm
  • MMLU: 56.37% acc_norm
  • ARC Challenge: 63.14% acc_norm
  • TruthfulQA: 51.27% mc2
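
For a quick single-number comparison, the four scores above can be averaged. The sketch below is only illustrative arithmetic: the average is not a figure reported in the source, and it mixes two metric types (acc_norm and mc2), so treat it as a rough summary rather than an official score.

```python
# Scores copied from the list above (percentages).
scores = {
    "HellaSwag (acc_norm)": 82.42,
    "MMLU (acc_norm)": 56.37,
    "ARC Challenge (acc_norm)": 63.14,
    "TruthfulQA (mc2)": 51.27,
}

# Unweighted mean across the four benchmarks.
average = sum(scores.values()) / len(scores)

for name, value in scores.items():
    print(f"{name:<26} {value:6.2f}")
print(f"{'Unweighted average':<26} {average:6.2f}")  # 63.30
```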

Good For

  • Applications requiring models to both explain concepts and follow specific instructions.
  • General-purpose conversational AI where clarity and adherence to prompts are important.
  • Developers looking for a Llama2-based model with enhanced instructional and explanatory capabilities.
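
For developers who want to try the model, a minimal sketch with the Hugging Face transformers library might look like the following. Several things here are assumptions and not stated in the text above: the "### System / ### User / ### Assistant" prompt layout (check the upstream model card for the exact template), the default system message, the helper names build_prompt and generate, and the choice of float16 with device_map="auto" (which needs the accelerate package and enough GPU memory for a 13B model).

```python
def build_prompt(instruction: str, system: str = "You are a helpful assistant.") -> str:
    """Assemble a prompt in an ASSUMED '### System / ### User / ### Assistant' layout."""
    return f"### System:\n{system}\n\n### User:\n{instruction}\n\n### Assistant:\n"


def generate(instruction: str, max_new_tokens: int = 256) -> str:
    """Load the model and return its reply to a single instruction (hypothetical helper)."""
    # Imported lazily so build_prompt can be used without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "pankajmathur/model_007_13b_v2"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # assumption: half precision to reduce memory use
        device_map="auto",          # assumption: requires the accelerate package
    )

    inputs = tokenizer(build_prompt(instruction), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    # Inspect the prompt without downloading any weights.
    print(build_prompt("Explain why the sky is blue, then list three key terms."))
```

Calling generate("...") downloads the full weights on first use, so it is worth checking the prompt with build_prompt first. Keep the prompt plus max_new_tokens within the 4k context length listed for this model.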