pacozaa/mistral-sharegpt90k-merged_16bit

Text generation · Concurrency cost: 1 · Model size: 7B · Quant: FP8 · Context length: 4K · Published: Apr 11, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights · Cold

The pacozaa/mistral-sharegpt90k-merged_16bit is a 7 billion parameter Mistral-based causal language model developed by pacozaa. It was fine-tuned using Unsloth and Hugging Face's TRL library, and merges in weights from a model trained on the ShareGPT90k dataset. The model is optimized for conversational tasks.


Model Overview

The pacozaa/mistral-sharegpt90k-merged_16bit is a 7 billion parameter language model based on the Mistral architecture. Developed by pacozaa, this model was fine-tuned from unsloth/mistral-7b-bnb-4bit and incorporates a merge from a model trained on the ShareGPT90k dataset.

Key Characteristics

  • Architecture: Mistral 7B
  • Training Efficiency: Utilizes Unsloth and Hugging Face's TRL library, enabling 2x faster training.
  • Dataset Influence: Merged from a model fine-tuned on the ShareGPT90k dataset, suggesting a strong capability in conversational and instruction-following tasks.
  • Parameter Count: 7 billion parameters, offering a balance between performance and computational requirements.
  • Context Length: Supports a context window of 4096 tokens.
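As a concrete illustration of the characteristics above, the sketch below encodes the 4096-token context limit and wraps a user message in the base Mistral-7B instruct format. Note the `[INST]` template is an assumption carried over from the base model; the ShareGPT90k fine-tune may ship its own chat template, so check the repo's tokenizer config before relying on it.

```python
MODEL_ID = "pacozaa/mistral-sharegpt90k-merged_16bit"
MAX_CONTEXT = 4096  # token context window, per the model card


def format_prompt(user_message: str) -> str:
    """Wrap a user message in Mistral's [INST] instruct format.

    Assumption: the fine-tune kept the base Mistral-7B-Instruct template.
    If the repo defines a chat_template, prefer
    tokenizer.apply_chat_template() instead.
    """
    return f"<s>[INST] {user_message.strip()} [/INST]"


def fits_context(n_prompt_tokens: int, max_new_tokens: int) -> bool:
    """Check that prompt plus generation budget stays within the 4k window."""
    return n_prompt_tokens + max_new_tokens <= MAX_CONTEXT
```

In practice you would count `n_prompt_tokens` with the model's own tokenizer rather than estimating it.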

Use Cases

This model is particularly well-suited for applications requiring:

  • Conversational AI: Its training on ShareGPT90k indicates proficiency in generating human-like dialogue and responses.
  • Instruction Following: Capable of understanding and executing complex instructions.
  • Efficient Deployment: The use of Unsloth for training suggests potential for optimized inference, making it suitable for resource-constrained environments or applications where speed is critical.
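For the deployment scenarios listed above, a minimal inference sketch with the Hugging Face Transformers library might look like the following. The dtype and device settings are assumptions about typical hardware, not requirements stated by the model card, and `device_map="auto"` additionally requires the `accelerate` package.

```python
MODEL_ID = "pacozaa/mistral-sharegpt90k-merged_16bit"


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and generate a completion for a formatted prompt.

    Imports are done lazily so the sketch can be read and tested without
    transformers installed; in a real script, import at module level.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",   # picks fp16/bf16 where the hardware supports it
        device_map="auto",    # assumption: `accelerate` is installed
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(generate("[INST] Write a short greeting. [/INST]"))
```

At roughly 14 GB for 16-bit weights, a 7B model of this kind typically needs a GPU with 16 GB+ of memory, or further quantization for smaller devices.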