Samantha-Qwen-2-7B Overview
Samantha-Qwen-2-7B is a 7.6 billion parameter language model developed by macadeliccc, built on the Qwen/Qwen2-7B base model. It was fine-tuned using QLoRA and FSDP, with training performed on a 2x4090 GPU setup.
Key Capabilities & Training Details
- Base Model: Qwen/Qwen2-7B, a robust foundation for conversational AI.
- Fine-tuning: Uses QLoRA for memory-efficient low-rank adaptation and FSDP to shard training across both GPUs.
- Training Datasets: Trained on a combination of datasets including macadeliccc/opus_samantha, uncensored-ultrachat.json, openhermes_200k.json, and opus_instruct.json, all formatted as ShareGPT conversations with ChatML.
- Context Length: Trained with a sequence length of 2048 tokens, with a reported context length of 131072 tokens.
- Prompt Template: Designed to work with the ChatML prompt format, making it suitable for assistant-style interactions.
- Quantization: Available in AWQ quantized versions for optimized inference.
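Since the model expects the ChatML prompt format, a minimal sketch of how a prompt is assembled may help; the role names and `<|im_start|>`/`<|im_end|>` delimiters follow the standard ChatML convention, and the system message shown is just an illustrative placeholder:

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Wrap a system message and a single user turn in ChatML delimiters,
    leaving the prompt open at the assistant turn for generation."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "You are Samantha, a helpful assistant.",
    "What is QLoRA?",
)
print(prompt)
```

In practice, a chat template bundled with the tokenizer (e.g. via `tokenizer.apply_chat_template`) would produce the same structure from a list of messages.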
Good For
- Conversational AI: Its training on diverse chat datasets makes it well-suited for assistant roles and dialogue generation.
- Developers using vLLM: Provides direct integration examples for deployment with vLLM's OpenAI-compatible API server.
- Resource-efficient deployment: The availability of quantized versions (AWQ) allows for more efficient inference on various hardware setups.
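As a sketch of the vLLM integration path, the request below targets a locally running vLLM OpenAI-compatible server; the endpoint URL, port, and model identifier are assumptions for illustration, and the final send is left out so the snippet stands alone:

```python
import json
from urllib.request import Request

# Hypothetical chat-completions payload for a vLLM server started with
# the OpenAI-compatible entrypoint (model name assumed for illustration).
payload = {
    "model": "macadeliccc/Samantha-Qwen-2-7B",
    "messages": [
        {"role": "system", "content": "You are Samantha, a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "max_tokens": 256,
}

# Build the POST request against the assumed local endpoint.
req = Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would send it; omitted so the sketch runs offline.
```

The same payload works unchanged with the official `openai` Python client pointed at the server's base URL, since vLLM mirrors the Chat Completions API shape.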