cookinai/titanbagel

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8K · Published: Feb 23, 2024 · License: cc-by-4.0 · Architecture: Transformer · Open Weights · Cold

cookinai/titanbagel is an 8 billion parameter language model: a finetune of Jon Durbin's Bagel DPO 7B, further trained on the Hercules 3.0 dataset, with an 8192-token context length. It is designed for general language tasks, combining its DPO base with the additional Hercules training for improved performance.


cookinai/titanbagel: An Enhanced Bagel DPO Model

cookinai/titanbagel is an 8 billion parameter language model built on Jon Durbin's Bagel DPO 7B. It has been further finetuned on the Hercules 3.0 dataset to improve performance across language understanding and generation tasks. Its 8192-token context window lets it process and generate longer sequences of text, making it suitable for applications that need more extensive contextual awareness.
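As a minimal sketch, assuming the weights are published under this repo id on the Hugging Face Hub, the model can be loaded with the transformers library; the dtype and generation settings below are illustrative assumptions, not values from this page:

```python
# Sketch: load cookinai/titanbagel with Hugging Face transformers and generate text.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cookinai/titanbagel"  # repo id taken from this page

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # assumption: fp16 fits your hardware; this page lists an FP8 quant
    device_map="auto",          # place layers across available devices
)

prompt = "Summarize the idea behind direct preference optimization in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```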

Key Capabilities

  • Finetuned Performance: Builds on the strengths of the Bagel DPO 7B base model, further refined with the Hercules 3.0 dataset.
  • Extended Context: Supports an 8192-token context window for understanding and generating longer texts (see the token-budget sketch after this list).
  • General Purpose: Suitable for a broad range of natural language processing tasks, thanks to its foundational training and subsequent finetuning.
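Because the prompt and the generated tokens share the same 8192-token window, it can help to check the token budget before sending a request. A small sketch, assuming the standard transformers tokenizer for this repo; the 256-token generation budget is an arbitrary example:

```python
# Sketch: check that a prompt plus its generation budget fits the context window.
from transformers import AutoTokenizer

CONTEXT_LENGTH = 8192  # context length stated on this page

tokenizer = AutoTokenizer.from_pretrained("cookinai/titanbagel")

def fits_in_context(prompt: str, max_new_tokens: int = 256) -> bool:
    """Return True if the prompt's tokens plus the generation budget fit in the window."""
    n_prompt_tokens = len(tokenizer.encode(prompt))
    return n_prompt_tokens + max_new_tokens <= CONTEXT_LENGTH

long_document = "..."  # any text you plan to send
print(fits_in_context(long_document))
```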

Good For

  • Applications that want a solid DPO base plus additional finetuning for improved general performance.
  • Tasks that benefit from an 8B parameter model with an 8K context window.
  • Experimentation with models finetuned on specific datasets such as Hercules 3.0.