ICBU-NPU/FashionGPT-70B-V1
FashionGPT-70B-V1 by ICBU-NPU is a Llama-2-70B-based model with roughly 69 billion parameters, enhanced with two merged adapters for improved performance. It was fine-tuned on a combination of Orca-style and Samantha datasets, with a focus on multi-turn conversational data. The model achieves an average score of 73.26 across the ARC, HellaSwag, MMLU, and TruthfulQA benchmarks, making it suitable for general conversational AI applications.
FashionGPT-70B-V1 Overview
FashionGPT-70B-V1 is a 69 billion parameter language model developed by ICBU-NPU, built on the Llama-2-70B architecture. Its key differentiator is the training approach: two distinct adapters are combined with the base Llama-2-70B model, a method the developers claim outperforms using a single adapter and plan to detail in an upcoming paper.
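As a rough illustration of what combining two adapters with a base model can look like using the Hugging Face PEFT library, here is a minimal sketch. The adapter paths, equal weights, and linear combination type are placeholder assumptions; the developers' actual recipe is not published here.

```python
# Sketch only: combining two LoRA adapters with a Llama-2-70B base via PEFT.
# Adapter paths, weights, and combination type are placeholders, not the
# actual FashionGPT configuration.
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-70b-hf",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Attach the first adapter, then load the second under its own name.
model = PeftModel.from_pretrained(base, "path/to/adapter_one", adapter_name="one")
model.load_adapter("path/to/adapter_two", adapter_name="two")

# Merge both adapters into a single weighted adapter and activate it.
model.add_weighted_adapter(
    adapters=["one", "two"],
    weights=[0.5, 0.5],
    adapter_name="combined",
    combination_type="linear",
)
model.set_adapter("combined")
```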
Key Capabilities & Training
- Adapter-based Fine-tuning: Utilizes two adapters trained with a forked QLoRA repository, allowing efficient fine-tuning of the quantized 70B base (a generic QLoRA-style setup is sketched after this list).
- Multi-turn Conversation Support: Enhanced with multi-turn conversational data support adapted from the FastChat repository, making it proficient in dialogue-based interactions.
- Diverse Training Data: Trained on a combination of datasets, including a filtered 40K subset of OpenOrca-GPT4 and airoboros-gpt4-1.4.1, alongside 6.5K cleaned samples from the Samantha dataset.
- Performance Benchmarks: Achieves competitive scores across standard benchmarks:
  - ARC (25-shot): 71.08
  - HellaSwag (10-shot): 87.32
  - MMLU (5-shot): 70.70
  - TruthfulQA (0-shot): 63.92
  - Average: 73.26
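For reference, a QLoRA-style fine-tuning setup built on the mainstream bitsandbytes and peft libraries is sketched below. This reflects the general technique rather than the developers' forked repository, and the LoRA hyperparameters are illustrative assumptions only.

```python
# Sketch of a generic QLoRA-style setup (4-bit quantized base + LoRA adapter).
# Hyperparameters are illustrative; FashionGPT's actual values are not published here.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # quantize the frozen base model to 4-bit
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-70b-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```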
Good For
- General Conversational AI: Its training on multi-turn data and diverse datasets makes it well-suited for chatbot applications and interactive dialogue systems (a minimal inference sketch follows this list).
- Research into Adapter Merging: Developers interested in the novel approach of combining multiple adapters for performance gains may find this model and its upcoming paper valuable.
- Applications requiring a Llama-2-70B base: Benefits from the robust foundation of the Llama-2-70B model while offering specialized fine-tuning.
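To close, a minimal multi-turn inference sketch using the Transformers library. The plain USER/ASSISTANT prompt layout is an assumption based on the FastChat-derived training pipeline mentioned above; check the official model card for the exact template before relying on it.

```python
# Sketch only: multi-turn chat inference with FashionGPT-70B-V1.
# The USER/ASSISTANT prompt format below is an assumption, not a documented template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ICBU-NPU/FashionGPT-70B-V1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

# Build a prompt from the conversation history so far.
history = [
    ("USER", "Suggest an outfit for a rainy autumn day."),
    ("ASSISTANT", "A waterproof trench coat over a knit sweater, dark jeans, and Chelsea boots."),
    ("USER", "Can you make it more formal?"),
]
prompt = "\n".join(f"{role}: {text}" for role, text in history) + "\nASSISTANT:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```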