TheBloke/UltraLM-13B-fp16

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Ctx Length: 4K · Published: Jun 29, 2023 · License: other · Architecture: Transformer

UltraLM-13B-fp16 is a 13 billion parameter language model developed by OpenBMB and fine-tuned from LLaMA-13B. It is designed specifically for multi-turn chat applications, trained on the UltraChat dataset. The model is provided in float16 PyTorch format, suitable for GPU inference and for further conversions. Its primary strength is generating coherent, contextually relevant responses in conversational settings.


UltraLM-13B-fp16: A Chat-Optimized LLaMA Variant

UltraLM-13B-fp16 is a 13 billion parameter language model originating from OpenBMB's UltraLM-13b project. It is a fine-tuned version of the LLaMA-13B base model, optimized specifically for multi-turn conversational interactions.

Key Capabilities

  • Multi-turn Chat: The model is trained on the UltraChat dataset, enabling it to handle complex, multi-turn dialogues effectively.
  • LLaMA-based Architecture: Built upon the robust LLaMA-13b foundation, it inherits strong language understanding and generation capabilities.
  • FP16 Format: Provided in a float16 PyTorch format, making it suitable for efficient GPU inference and as a base for further quantization or fine-tuning.

Usage and Licensing

UltraLM-13B-fp16 requires a specific prompt template for optimal performance, following a User: instruction<eos_token> Assistant: response<eos_token> structure. Users should note that the model inherits its license from the LLaMA model license.
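A minimal sketch of the template above in Python. The helper name `build_prompt` is hypothetical, and the `</s>` end-of-sequence token is assumed from the LLaMA tokenizer — verify it against the model tokenizer's `eos_token` before use:

```python
# Hypothetical helper illustrating the UltraLM prompt template described above.
# Assumes the LLaMA end-of-sequence token "</s>"; check the tokenizer's
# eos_token attribute to confirm.
EOS = "</s>"

def build_prompt(turns, new_user_message):
    """Format a multi-turn conversation for UltraLM-13B.

    turns: list of (user, assistant) message pairs from earlier turns.
    new_user_message: the latest user instruction awaiting a response.
    """
    parts = []
    for user_msg, assistant_msg in turns:
        parts.append(f"User: {user_msg}{EOS}")
        parts.append(f"Assistant: {assistant_msg}{EOS}")
    # The prompt ends with an open "Assistant:" cue for the model to complete.
    parts.append(f"User: {new_user_message}{EOS}")
    parts.append("Assistant:")
    return " ".join(parts)

prompt = build_prompt([("Hi!", "Hello, how can I help?")], "Tell me a joke.")
```

The returned string can be passed directly to a text-generation pipeline; the trailing open Assistant: turn is what prompts the model to produce its response.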

Good For

  • Developing chatbots and conversational AI agents.
  • Applications requiring coherent and context-aware responses in dialogue.
  • Researchers and developers looking for a LLaMA-based model optimized for chat.