ICBU-NPU/FashionGPT-70B-V1.1

Text Generation | Concurrency Cost: 4 | Model Size: 69B | Quant: FP8 | Ctx Length: 32k | Published: Sep 17, 2023 | License: llama2 | Architecture: Transformer | Open Weights

ICBU-NPU/FashionGPT-70B-V1.1 is a 69-billion-parameter model based on Llama-2-70B, fine-tuned using multiple adapters and a diverse dataset including Orca-style, Samantha, oasst1, and GPT-4 multi-turn conversations. It achieves an average score of 74.05 across the ARC, HellaSwag, MMLU, and TruthfulQA benchmarks. The model is designed for general conversational AI tasks and leverages a novel strategy for combining adapters.


Overview

ICBU-NPU/FashionGPT-70B-V1.1 is a 69-billion-parameter language model built on the Llama-2-70B architecture. It integrates multiple adapters through a novel combination strategy, enhancing its capabilities for conversational AI. The model was trained with a modified QLoRA approach, with multi-turn conversation support adapted from FastChat.
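A minimal inference sketch with Hugging Face Transformers follows, assuming the weights are hosted on the Hub under the repo id shown above. The dtype, device placement, and prompt template are illustrative assumptions (the exact conversation format is not documented here), not the model's canonical usage.

```python
# Hedged inference sketch: repo id taken from the model card above; dtype,
# device mapping, and prompt format are assumptions, not documented facts.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ICBU-NPU/FashionGPT-70B-V1.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # a 69B model needs ~140 GB in fp16; quantize if needed
    device_map="auto",          # shard layers across available GPUs
)

# Placeholder prompt format; check the upstream model card for the real template.
prompt = "### Human: Summarize the benefits of adapter-based fine-tuning.\n### Assistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```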

Key Capabilities

  • General Conversational AI: Fine-tuned on a diverse set of conversational datasets, including filtered OpenOrca-GPT4, airoboros-gpt4, Samantha, oasst1, and GPT-4 multi-turn conversations.
  • Robust Training Methodology: Uses a forked QLoRA repository and a unique method for combining multiple adapters, to be detailed in an upcoming paper; a sketch of the standard adapter-merging workflow appears after this list.
  • Performance Benchmarks: Achieves competitive scores across standard evaluations (a reproduction sketch also follows the list):
    • ARC (25-shot): 71.76
    • HellaSwag (10-shot): 88.20
    • MMLU (5-shot): 70.99
    • TruthfulQA (0-shot): 65.26
    • Average: 74.05
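Because the paper describing the adapter-combination strategy has not yet been published, the snippet below is only a sketch of the standard PEFT workflow for folding LoRA adapters into a Llama-2 base model. It is not the authors' actual method, and the adapter repo ids are hypothetical placeholders.

```python
# Sketch of the standard PEFT adapter-merging loop (NOT the unpublished
# FashionGPT strategy). Adapter repo ids below are hypothetical.
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-70b-hf", device_map="auto"
)

# Apply each LoRA adapter in turn and bake its deltas into the dense weights.
for adapter_id in ["example-org/orca-lora", "example-org/samantha-lora"]:
    wrapped = PeftModel.from_pretrained(base, adapter_id)
    base = wrapped.merge_and_unload()  # returns the base model with merged weights

base.save_pretrained("fashiongpt-style-merged")  # hypothetical output path
```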
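The shot counts above match the Open LLM Leaderboard setup, which is commonly run with EleutherAI's lm-evaluation-harness. Below is a minimal sketch of reproducing the ARC number with the harness's Python API (assuming lm-eval v0.4+ is installed); exact figures can vary with harness version and generation settings.

```python
# Hedged evaluation sketch using EleutherAI's lm-evaluation-harness
# (pip install lm-eval). Scores may differ from the card depending on
# harness version and prompt formatting.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=ICBU-NPU/FashionGPT-70B-V1.1,dtype=float16",
    tasks=["arc_challenge"],
    num_fewshot=25,  # the card reports ARC at 25-shot
)
print(results["results"]["arc_challenge"])
```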

Limitations and Licensing

As a Llama-2-based model, FashionGPT-70B-V1.1 is subject to the original Llama-2 license and usage restrictions. Users should be aware of the inherent risks of LLMs, including the potential for inaccurate, biased, or otherwise objectionable responses, and are advised to perform safety testing tailored to their specific applications.