iamplus/Llama-2-13b-hf-ChatOrca

Text Generation · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Ctx Length: 4k · License: MIT · Architecture: Transformer · Open Weights · Cold

iamplus/Llama-2-13b-hf-ChatOrca is a 13-billion-parameter language model fine-tuned from Meta's Llama-2-13b-hf base model. It is trained on a blend of Orca data and multi-turn conversation datasets to strengthen its reasoning and its ability to sustain extended dialogues. The model targets complex reasoning tasks and coherent multi-turn conversation, making it suitable for advanced conversational AI applications.


iamplus/Llama-2-13b-hf-ChatOrca Overview

This model is a fine-tuned version of the Meta Llama-2-13b-hf base model, developed by iamplus. It is specifically designed to improve reasoning abilities and multi-turn conversational fluency. The training methodology involved a unique blend of datasets, focusing on enhancing both logical processing and natural dialogue flow.

Key Capabilities

  • Enhanced Reasoning: Trained on Orca data, which is known for improving complex reasoning and instruction-following.
  • Multi-turn Conversation: Incorporates a significant amount of open-source and closed multi-turn conversation data, enabling it to maintain context and coherence over extended dialogues.
  • Llama 2 Prompt Format: Adheres to Meta's official Llama 2 chat model prompt format, ensuring compatibility and ease of use with existing Llama 2-based applications.
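Meta's Llama 2 chat convention wraps each user turn in `[INST] … [/INST]` tags, with an optional system prompt enclosed in `<<SYS>> … <</SYS>>` inside the first turn. A minimal sketch of assembling such a prompt for a multi-turn conversation (the `build_prompt` helper is illustrative, not part of this model's release; the tag layout follows the public Llama 2 format):

```python
# Llama 2 chat prompt format: special tags per Meta's public convention.
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

def build_prompt(system: str, turns: list[tuple[str, str]], user_msg: str) -> str:
    """Format a conversation: `turns` holds completed (user, assistant) pairs,
    `user_msg` is the new user message awaiting a reply."""
    prompt = ""
    for i, (user, assistant) in enumerate(turns):
        # The system prompt is folded into the first user turn only.
        content = f"{B_SYS}{system}{E_SYS}{user}" if i == 0 and system else user
        prompt += f"<s>{B_INST} {content} {E_INST} {assistant} </s>"
    content = f"{B_SYS}{system}{E_SYS}{user_msg}" if not turns and system else user_msg
    prompt += f"<s>{B_INST} {content} {E_INST}"
    return prompt

print(build_prompt("You are helpful.", [("Hi", "Hello!")], "What is 2+2?"))
```

Because the model adheres to this format, prompts built this way should drop into any pipeline already targeting Llama 2 chat models.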

Training Details

The model was trained for 2 epochs with a batch size of 128 and a sequence length of 4096, using a learning rate of 2e-5 on a cosine schedule and the AnyPrecision AdamW optimizer in bf16 precision. The training data comprised:

  • 1 million samples of GPT-4 Orca data (OpenOrca).
  • 1.7 million samples of diverse chat data, including OpenAssistant chat and UltraChat.
  • 30,000 samples from OpenPlatypus data.
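The hyperparameters and data mix above can be summarized in one place (a sketch that only restates the card's stated values; the dictionary layout and key names are illustrative, not an actual config file from iamplus):

```python
# Training setup as stated in the model card; the config structure itself
# is hypothetical, assembled here only to summarize the listed values.
training_config = {
    "base_model": "meta-llama/Llama-2-13b-hf",
    "epochs": 2,
    "batch_size": 128,
    "sequence_length": 4096,
    "learning_rate": 2e-5,
    "lr_schedule": "cosine",
    "optimizer": "AnyPrecision AdamW",
    "precision": "bf16",
    "data_mix": {  # sample counts per source
        "OpenOrca (GPT-4)": 1_000_000,
        "chat data (OpenAssistant, UltraChat, ...)": 1_700_000,
        "OpenPlatypus": 30_000,
    },
}

total_samples = sum(training_config["data_mix"].values())
print(f"total training samples: {total_samples:,}")  # 2,730,000
```

The mix is weighted heavily toward conversational data, consistent with the model's stated focus on multi-turn dialogue alongside Orca-style reasoning.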

Good for

  • Applications requiring advanced reasoning and problem-solving.
  • Building chatbots or conversational agents that need to handle complex, multi-turn interactions.
  • Developers familiar with the Llama 2 ecosystem looking for a model optimized for both reasoning and conversational depth.