LLM360/AmberChat
Text generation · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Nov 30, 2023 · License: apache-2.0 · Architecture: Transformer

LLM360/AmberChat is a 7 billion parameter instruction-following language model developed by LLM360, built on the LLaMA-7B architecture. Fine-tuned from LLM360/Amber, it excels in conversational AI and general instruction adherence, achieving a strong MT-Bench score of 5.428125. This model is optimized for generating helpful, detailed, and polite responses within a 4096-token context window, making it suitable for various chat-based applications.


AmberChat: An Instruction-Following LLM

LLM360/AmberChat is a 7 billion parameter instruction-following language model developed by LLM360 as part of their Amber model family. It is fine-tuned from the LLM360/Amber base model and shares the same architecture as LLaMA-7B.

Key Capabilities & Performance

  • Instruction Following: AmberChat is specifically designed for instruction adherence, providing helpful, detailed, and polite answers in conversational settings.
  • Strong Benchmarking: It achieves an MT-Bench score of 5.428125, outperforming its base model (LLM360/Amber) and Falcon-40B-Instruct, and performing comparably to MPT-7B-Chat and Nous-Hermes-13B.
  • LLaMA Architecture: Built on the LLaMA-7B architecture, it features 32 attention heads, 32 hidden layers, and a hidden size of 4096.
  • Context Length: Trained with a maximum sequence length of 2048 tokens; the model is served with a 4096-token context window.
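The architecture figures above are enough for a back-of-the-envelope check that this is indeed a "7B" model. The sketch below assumes the standard LLaMA-7B FFN intermediate size (11008) and vocabulary size (32000), which are not stated in the card:

```python
# Rough parameter count for the LLaMA-7B architecture described above
# (32 layers, hidden size 4096). Intermediate and vocab sizes are
# assumed standard LLaMA-7B values; layer norms are ignored.

HIDDEN = 4096
LAYERS = 32
INTERMEDIATE = 11008  # assumed LLaMA-7B FFN width
VOCAB = 32000         # assumed LLaMA tokenizer vocabulary

embed = VOCAB * HIDDEN            # input embeddings
attn = 4 * HIDDEN * HIDDEN        # Q, K, V, and output projections
mlp = 3 * HIDDEN * INTERMEDIATE   # gate, up, and down projections
per_layer = attn + mlp
lm_head = VOCAB * HIDDEN          # output projection (untied)

total = embed + LAYERS * per_layer + lm_head
print(f"{total / 1e9:.2f}B parameters")  # roughly 6.7B, marketed as "7B"
```

The total comes out just under 7 billion, consistent with the "7B" label.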

Training Details

AmberChat was fine-tuned on a data mix of roughly 233k rows, drawing on datasets such as WizardLM/WizardLM_evol_instruct_V2_196k and icybee/share_gpt_90k_v1. Training ran for 3 epochs with a learning rate of 2e-5 and a model_max_length of 2048.
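The reported hyperparameters can be collected into a TrainingArguments-style config. Only the three values named above (epochs, learning rate, max length) come from the card; the remaining keys and values are illustrative assumptions:

```python
# Sketch of the reported fine-tuning setup. Fields marked "assumed"
# are placeholders, not taken from the model card.
finetune_config = {
    "model_name_or_path": "LLM360/Amber",  # base model being fine-tuned
    "num_train_epochs": 3,                 # reported
    "learning_rate": 2e-5,                 # reported
    "model_max_length": 2048,              # reported training sequence length
    "per_device_train_batch_size": 8,      # assumed
    "lr_scheduler_type": "cosine",         # assumed
}
```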

Deployment Options

Users can easily load AmberChat using the Hugging Face transformers library or deploy it via FastChat. Quantized versions are also available for local deployment with Ollama, providing flexibility for various hardware setups.
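Since the model is FastChat-compatible, prompts are typically built with a Vicuna/FastChat-style conversation template. The system prompt and turn markers below are an assumed convention, not quoted from the official template:

```python
# Build a single-turn prompt in an assumed Vicuna/FastChat-style template.
# The system prompt wording and "###Human:"/"###Assistant:" markers are
# a convention, not verified against the official AmberChat template.

SYSTEM = (
    "A chat between a curious human and an artificial intelligence "
    "assistant. The assistant gives helpful, detailed, and polite "
    "answers to the human's questions."
)

def build_prompt(user_message: str) -> str:
    return f"{SYSTEM}\n###Human: {user_message}\n###Assistant:"

prompt = build_prompt("How do glaciers form?")
```

The resulting string would then be tokenized and passed to the model loaded via `transformers`, e.g. `LlamaForCausalLM.from_pretrained("LLM360/AmberChat")` followed by `model.generate(...)`.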