EmbeddedLLM/Mistral-7B-Merge-14-v0.3-ft-step-9984

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 4k · Published: Jan 4, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

EmbeddedLLM/Mistral-7B-Merge-14-v0.3-ft-step-9984 is a 7 billion parameter language model fine-tuned from EmbeddedLLM/Mistral-7B-Merge-14-v0.3. It was fine-tuned for 9984 steps on a diverse dataset blend including dolphin, dolphin-coder, Magicoder-OSS-Instruct-75K, openhermes, and Synthia-v1.3. It is designed for general conversational AI and coding assistance, supports a 4096-token context length, and uses the ChatML prompt format.


Overview

This model, EmbeddedLLM/Mistral-7B-Merge-14-v0.3-ft-step-9984, is a 7 billion parameter language model derived from EmbeddedLLM/Mistral-7B-Merge-14-v0.3. It was fine-tuned for 9984 steps on a blend of conversational and code-oriented datasets to strengthen both dialogue and programming performance.
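
For reference, a minimal loading sketch using the Hugging Face transformers library, assuming a recent transformers release and a GPU with enough memory for FP16 weights; adjust dtype and device for your hardware:

```python
# Minimal loading sketch (assumes transformers and torch are installed
# and a CUDA GPU is available; adjust for your environment).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EmbeddedLLM/Mistral-7B-Merge-14-v0.3-ft-step-9984"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # FP16 for local inference; the hosted variant is quantized to FP8
    device_map="auto",
)
```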

Key Characteristics

  • Base Model: Fine-tuned from EmbeddedLLM/Mistral-7B-Merge-14-v0.3.
  • Training Data: Fine-tuned on a blend of dolphin, dolphin-coder, Magicoder-OSS-Instruct-75K, openhermes, and Synthia-v1.3.
  • Context Length: Supports a 4096 token context window.
  • Prompt Format: Employs the ChatML format, with explicit system, user, and assistant turns (see the template sketch after this list).
  • Training Process: Fine-tuned for 3 epochs on 4 A100 GPUs using the axolotl framework.
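
A minimal sketch of the ChatML structure referenced above; the system and user messages here are illustrative, not taken from the model card:

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
How do I reverse a list in Python?<|im_end|>
<|im_start|>assistant
```

The final `<|im_start|>assistant` line is left open so the model completes the assistant turn; generation should stop when the model emits `<|im_end|>`.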

Intended Use Cases

This model is well-suited for applications requiring:

  • General Conversational AI: Dialogue-oriented datasets such as openhermes and Synthia-v1.3 suggest proficiency across varied conversation scenarios.
  • Coding Assistance: The inclusion of dolphin-coder and Magicoder-OSS-Instruct-75K indicates strong performance in code generation, explanation, and debugging tasks (see the usage sketch after this list).
  • Instruction Following: Fine-tuning with instruction-based datasets enhances its ability to follow complex commands and generate relevant responses.
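
Continuing from the loading sketch in the Overview, here is a sketch of a coding-assistance request. It assumes the tokenizer bundles a ChatML chat template; if it does not, build the prompt manually using the template shown in Key Characteristics:

```python
# Coding-assistance sketch (reuses `tokenizer` and `model` from the
# loading sketch above; assumes the tokenizer ships a ChatML chat template).
messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "Write a Python function that checks whether a string is a palindrome."},
]

# Render the ChatML prompt, leaving the assistant turn open for generation.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```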