TheBloke/CAMEL-13B-Combined-Data-SuperHOT-8K-fp16
Text Generation · Model Size: 13B · Quant: FP16 · Context Length: 8K · Published: Jun 27, 2023 · License: other · Architecture: Transformer

TheBloke/CAMEL-13B-Combined-Data-SuperHOT-8K-fp16 is a 13-billion-parameter language model created by merging Camel AI's CAMEL-13B-Combined-Data with Kaio Ken's SuperHOT 8K LoRA. It targets an extended context length of 8K tokens, using the SuperHOT technique for stronger long-context understanding. The model is fine-tuned on a diverse dataset that includes CAMEL framework conversations, ShareGPT, and Alpaca instructions, making it suitable for chat and instruction-following applications that require longer conversational memory.


Model Overview

This model, CAMEL-13B-Combined-Data-SuperHOT-8K-fp16, merges Camel AI's CAMEL-13B-Combined-Data with Kaio Ken's SuperHOT 8K LoRA. Its primary differentiator is an extended context window of 8K tokens (up from LLaMA's native 2K), achieved through the SuperHOT technique, which allows the model to process and generate much longer sequences of text.
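
SuperHOT extends the context by linearly interpolating LLaMA's rotary position embeddings (RoPE). The sketch below shows one way to load the merge with Hugging Face transformers; it assumes transformers >= 4.31, which added the `rope_scaling` option (the original release instead shipped a RoPE monkey patch), and a scaling factor of 4.0, since LLaMA-1's native window is 2,048 tokens.

```python
# Minimal loading sketch (assumes transformers >= 4.31 and the accelerate
# package for device_map). Linear RoPE scaling with factor 4.0 stretches
# LLaMA-1's native 2,048-token window to the advertised 8K (2048 * 4 = 8192).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/CAMEL-13B-Combined-Data-SuperHOT-8K-fp16"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # fp16 weights: roughly 26 GB of GPU memory
    device_map="auto",          # let accelerate place layers on available GPUs
    rope_scaling={"type": "linear", "factor": 4.0},  # SuperHOT-style interpolation
)
```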

Key Capabilities

  • Extended Context Understanding: Leverages an 8K token context length, enabling the model to maintain coherence and draw information from much longer inputs compared to standard models.
  • Instruction Following: Fine-tuned on the Alpaca dataset, enhancing its ability to understand and execute instructions (see the prompt sketch after this list).
  • Conversational AI: Benefits from training on 229K conversations collected via the CAMEL framework and 100K English public conversations from ShareGPT, making it proficient in chat-based interactions.
  • Performance: The base CAMEL-13B model scores an average of 58.9 on the EleutherAI language model evaluation harness, outperforming LLaMA-13B and closely matching Vicuna-13B on benchmarks such as ARC-C, HellaSwag, MMLU, and TruthfulQA (a reproduction sketch follows this list).
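
The exact prompt template is not documented here; given the Alpaca instruction data, an Alpaca-style prompt is a reasonable assumption. This sketch reuses the `model` and `tokenizer` from the loading example above.

```python
# Hypothetical Alpaca-style prompt; the template is an assumption based on
# the model's Alpaca fine-tuning data, not a documented format.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nExplain RoPE position interpolation in two sentences.\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Strip the echoed prompt and print only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```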

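Those scores can in principle be checked with EleutherAI's lm-evaluation-harness. The sketch below assumes harness v0.4 (`pip install lm-eval`); task names, few-shot settings, and the `simple_evaluate` API differ across versions, so treat it as a starting point rather than an exact replication of the reported numbers.

```python
# Benchmark sketch using EleutherAI's lm-evaluation-harness (v0.4 API).
# Few-shot counts are left at defaults here; the leaderboard uses per-task values.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=TheBloke/CAMEL-13B-Combined-Data-SuperHOT-8K-fp16,dtype=float16",
    tasks=["arc_challenge", "hellaswag", "mmlu", "truthfulqa_mc2"],
)
print(results["results"])
```
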
Good For

  • Applications requiring long-form text generation or analysis.
  • Complex conversational agents that need to remember and reference earlier parts of a dialogue (see the chat-loop sketch after this list).
  • Instruction-based tasks where detailed context is crucial for accurate responses.
  • Scenarios where a larger context window matters more than raw parameter count.
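
Because the full transcript is replayed on every turn, the 8K window, not the 13B parameter count, is what bounds conversational memory. A toy chat loop illustrating this, reusing the `model` and `tokenizer` from the loading sketch and the assumed Alpaca-style formatting:

```python
# Toy multi-turn loop (assumed formatting). Each call replays the whole
# transcript, so recall of earlier turns lasts only as long as the
# transcript fits inside the 8,192-token window.
history = []

def chat(user_msg: str) -> str:
    history.append(f"### Instruction:\n{user_msg}\n\n### Response:\n")
    prompt = "".join(history)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    if inputs["input_ids"].shape[1] >= 8192:
        raise ValueError("transcript no longer fits in the 8K context window")
    out = model.generate(**inputs, max_new_tokens=256)
    reply = tokenizer.decode(
        out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
    history[-1] += reply + "\n\n"  # keep the model's reply in the transcript
    return reply
```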