Norquinal/llama-2-7b-claude-chat

Text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4K · Published: Aug 11, 2023 · Architecture: Transformer

Norquinal/llama-2-7b-claude-chat is a 7-billion-parameter LLaMA-2-7b-hf model fine-tuned by Norquinal using QLoRA (4-bit precision). It was trained on a subset of the claude_multiround_chat dataset, focusing on multi-round conversational interactions. The model is an experimental fine-tune, intended primarily for exploring conversational AI capabilities on the LLaMA-2 architecture.


Model Overview

Norquinal/llama-2-7b-claude-chat is an experimental 7-billion-parameter language model based on the LLaMA-2-7b-hf architecture. It was fine-tuned by Norquinal using QLoRA (4-bit precision) on a custom dataset, claude_multiround_chat_1k, a randomized subset of a larger 30k-sample dataset. The model was trained with the Vicuna 1.1 prompt format and is designed for multi-turn chat interactions.
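For context, a minimal sketch of loading the checkpoint for inference with the Hugging Face transformers stack is shown below. The 4-bit quantization settings mirror the QLoRA training precision but are illustrative assumptions, as are the generation parameters; neither is published by the author.

```python
# Minimal sketch: load Norquinal/llama-2-7b-claude-chat for inference.
# The 4-bit settings below are assumptions chosen to mirror the QLoRA
# training precision; the author does not prescribe an inference setup.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "Norquinal/llama-2-7b-claude-chat"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # NF4 4-bit weights, as in QLoRA
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
)

# Single-turn prompt in the Vicuna 1.1 style the model was trained on.
prompt = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions. "
    "USER: What is QLoRA? ASSISTANT:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```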

Key Characteristics

  • Base Model: LLaMA-2-7b-hf
  • Fine-tuning Method: QLoRA (4-bit precision)
  • Training Data: claude_multiround_chat_1k dataset, focused on multi-round chat conversations
  • Prompt Format: Vicuna 1.1, optimized for conversational exchanges (see the sketch below)
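
A rough illustration of the Vicuna 1.1 multi-turn layout follows. The system preamble and separators shown are the standard Vicuna 1.1 convention and are an assumption here, not details published with this model.

```python
# Sketch of the Vicuna 1.1 multi-turn prompt layout (standard convention;
# the exact preamble and separators are assumptions for this model).
SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)

def build_vicuna_prompt(turns: list[tuple[str, str | None]]) -> str:
    """turns: list of (user_message, assistant_reply); the final reply may be None."""
    prompt = SYSTEM
    for user_msg, assistant_msg in turns:
        prompt += f" USER: {user_msg} ASSISTANT:"
        if assistant_msg is not None:
            prompt += f" {assistant_msg}</s>"
    return prompt

# Example: one completed exchange plus a new user question awaiting a reply.
print(build_vicuna_prompt([
    ("Summarize what QLoRA does.",
     "QLoRA fine-tunes a 4-bit quantized base model by training LoRA adapters on top of it."),
    ("Why is that useful for a 7B model?", None),
]))
```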

Performance Metrics

While the creator notes this is an experimental model, evaluation results from the Open LLM Leaderboard are available:

  • Avg.: 44.54
  • ARC (25-shot): 54.44
  • HellaSwag (10-shot): 80.66
  • MMLU (5-shot): 46.74
  • TruthfulQA (0-shot): 41.39
  • Winogrande (5-shot): 74.9
  • GSM8K (5-shot): 7.73
  • DROP (3-shot): 5.89
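
The reported average appears to be the unweighted mean of the seven benchmark scores above; a quick arithmetic check, assuming that interpretation:

```python
# Sanity check: the "Avg." figure matches the unweighted mean of the seven benchmarks.
scores = [54.44, 80.66, 46.74, 41.39, 74.9, 7.73, 5.89]
print(round(sum(scores) / len(scores), 2))  # -> 44.54
```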

Intended Use

This model is primarily an experimental project by Norquinal to explore fine-tuning LLaMA-2 for conversational tasks. It is not presented as a production-ready or highly optimized solution, but rather as a demonstration of fine-tuning on specific chat datasets.