pacozaa/mistral-sharegpt90k-merged_16bit
Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 4k · Published: Apr 11, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold
The pacozaa/mistral-sharegpt90k-merged_16bit is a 7 billion parameter Mistral-based causal language model developed by pacozaa. It was fine-tuned using Unsloth and Hugging Face's TRL library, and merges weights from a model trained on the ShareGPT90k dataset. It is optimized for conversational tasks, and its Unsloth-based training pipeline keeps fine-tuning fast and memory-efficient.
Model Overview
The pacozaa/mistral-sharegpt90k-merged_16bit is a 7 billion parameter language model based on the Mistral architecture. Developed by pacozaa, it was fine-tuned from unsloth/mistral-7b-bnb-4bit, incorporates a merge from a model trained on the ShareGPT90k dataset, and is exported as full 16-bit weights (hence the merged_16bit suffix).
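Assuming the checkpoint is published on the Hugging Face Hub under the same identifier (an assumption; check the actual repository), it should load with the standard transformers API. A minimal sketch:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hub identifier assumed to match the model name above.
model_id = "pacozaa/mistral-sharegpt90k-merged_16bit"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the checkpoint's native 16-bit precision
    device_map="auto",    # requires accelerate; shards across available GPUs
)
```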
Key Characteristics
- Architecture: Mistral 7B
- Training Efficiency: Trained with Unsloth and Hugging Face's TRL library, enabling 2x faster fine-tuning (a workflow sketch follows this list).
- Dataset Influence: Merged from a model fine-tuned on the ShareGPT90k dataset, suggesting a strong capability in conversational and instruction-following tasks.
- Parameter Count: 7 billion parameters, offering a balance between performance and computational requirements.
- Context Length: Supports a context window of 4096 tokens.
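The exact training recipe is not published, but the model name and description match Unsloth's standard LoRA fine-tune-then-merge workflow. A hedged sketch of that workflow follows; the LoRA rank, target modules, and dataset wiring are illustrative assumptions, not the author's actual settings:

```python
from unsloth import FastLanguageModel

# Load the 4-bit base model the card names as the fine-tuning starting point.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/mistral-7b-bnb-4bit",
    max_seq_length=4096,
    load_in_4bit=True,
)

# Attach LoRA adapters; r=16 and this module list are assumed defaults.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# ... fine-tune with TRL's SFTTrainer on a ShareGPT-style dataset ...

# Merge the adapters into the base weights and save in 16-bit precision,
# which is what the "merged_16bit" suffix in the model name refers to.
model.save_pretrained_merged(
    "mistral-sharegpt90k-merged_16bit",
    tokenizer,
    save_method="merged_16bit",
)
```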
Use Cases
This model is particularly well-suited for applications requiring:
- Conversational AI: Training on ShareGPT90k indicates proficiency in generating human-like dialogue and responses (see the generation example after this list).
- Instruction Following: Capable of understanding and executing complex instructions.
- Efficient Deployment: With 7 billion parameters and merged 16-bit weights, the model can be served with modest resources, making it suitable for resource-constrained environments or applications where speed is critical.
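For conversational use, generation follows the usual transformers pattern. The snippet below assumes the tokenizer ships a chat template (Mistral instruct-style checkpoints typically do); if it does not, fall back to the raw Mistral [INST] ... [/INST] prompt format:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "pacozaa/mistral-sharegpt90k-merged_16bit"  # assumed Hub identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "user", "content": "Summarize what a context window is."}
]

# apply_chat_template formats the conversation the way the model expects.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids, max_new_tokens=128, do_sample=True, temperature=0.7
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```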