iamplus/Llama-2-13b-hf-ChatOrca Overview
This model is a fine-tuned version of the Meta Llama-2-13b-hf base model, developed by iamplus. It is designed to improve reasoning ability and multi-turn conversational fluency. The training data blends Orca-style reasoning data with open-source and closed multi-turn chat data, targeting both logical processing and natural dialogue flow.
Key Capabilities
- Enhanced Reasoning: Trained on Orca data, which is known for improving complex reasoning and instruction-following.
- Multi-turn Conversation: Incorporates a significant amount of open-source and closed multi-turn conversation data, enabling it to maintain context and coherence over extended dialogues.
- Llama 2 Prompt Format: Adheres to Meta's official Llama 2 chat model prompt format, ensuring compatibility and ease of use with existing Llama 2-based applications.
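Because the model follows Meta's standard Llama 2 chat template (`[INST]`/`[/INST]` turn markers with an optional `<<SYS>>` system block in the first turn), prompts can be assembled as in this minimal sketch. The helper name `build_llama2_prompt` is illustrative, not part of the model's API:

```python
def build_llama2_prompt(system_msg, turns):
    """Assemble a Llama 2 chat prompt.

    turns: list of (user, assistant) pairs; set assistant to None in the
    final pair to request the model's next completion.
    """
    prompt = ""
    for i, (user, assistant) in enumerate(turns):
        if i == 0:
            # The system message is wrapped in <<SYS>> tags inside the first user turn.
            user = f"<<SYS>>\n{system_msg}\n<</SYS>>\n\n{user}"
        prompt += f"<s>[INST] {user} [/INST]"
        if assistant is not None:
            prompt += f" {assistant} </s>"
    return prompt

prompt = build_llama2_prompt(
    "You are a helpful assistant.",
    [("What is 2+2?", "4."), ("And times 3?", None)],
)
```

Ending the string with `[/INST]` leaves the model positioned to generate the next assistant reply, which is what keeps multi-turn context intact.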
Training Details
The model was trained for 2 epochs with a batch size of 128 and a sequence length of 4096, using a learning rate of 2e-5 with a cosine schedule and the AnyPrecision AdamW optimizer in bf16 precision. The training data comprised:
- 1 million samples of GPT-4 Orca data (OpenOrca).
- 1.7 million samples of diverse chat data, including OpenAssistant Chat and UltraChat.
- 30,000 samples from OpenPlatypus data.
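The sample counts above imply a rough step budget, sketched below. The arithmetic assumes every sample is seen once per epoch with no packing or dropping, so treat the result as an estimate rather than a figure from the training logs:

```python
import math

# Sample counts as stated in the model card.
samples = {
    "OpenOrca (GPT-4)": 1_000_000,
    "chat data (OpenAssistant, UltraChat, ...)": 1_700_000,
    "OpenPlatypus": 30_000,
}
batch_size = 128
epochs = 2

total_samples = sum(samples.values())                 # 2,730,000
steps_per_epoch = math.ceil(total_samples / batch_size)
total_steps = steps_per_epoch * epochs

print(total_samples, steps_per_epoch, total_steps)
```

This works out to roughly 21k optimizer steps per epoch, about 43k steps overall.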
Good for
- Applications requiring advanced reasoning and problem-solving.
- Building chatbots or conversational agents that need to handle complex, multi-turn interactions.
- Developers familiar with the Llama 2 ecosystem looking for a model optimized for both reasoning and conversational depth.