cookinai/DonutLM-v1

Text generation · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Dec 20, 2023 · License: apache-2.0 · Architecture: Transformer · Open weights

DonutLM-v1 by cookinai is a 7 billion parameter language model with a 4096-token context length, created through a Slerp (spherical linear interpolation) merge of AIDC-ai-business/Marcoroni-7B-v3 and jondurbin/bagel-dpo-7b-v0.1. The merge combines the fine-tuning of both base models into a single checkpoint for general language understanding and generation tasks. Its primary audience is developers who want a merged model that blends two distinct fine-tuning approaches.


DonutLM-v1: A Slerp Merged 7B Language Model

DonutLM-v1, developed by cookinai, is a 7 billion parameter language model for general-purpose text generation and understanding. It is a Slerp merge of two base models, AIDC-ai-business/Marcoroni-7B-v3 and jondurbin/bagel-dpo-7b-v0.1: rather than averaging weights linearly, Slerp interpolates along the arc between the two weight tensors, which tends to preserve the character of each parent model better than a plain linear blend.
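The model card does not publish the exact merge recipe, but the core Slerp operation can be sketched in a few lines. The function below is an illustrative NumPy implementation of spherical linear interpolation between two flattened weight vectors, not the actual merge code used for this model; the interpolation fraction `t` and the fallback threshold `eps` are assumptions for the sketch.

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.

    Interpolates at fraction t along the arc from v0 to v1,
    falling back to plain linear interpolation when the vectors
    are nearly colinear (where the spherical weights degenerate).
    """
    v0 = np.asarray(v0, dtype=np.float64)
    v1 = np.asarray(v1, dtype=np.float64)
    # Cosine of the angle between the normalized vectors.
    dot = np.dot(v0, v1) / (np.linalg.norm(v0) * np.linalg.norm(v1))
    dot = np.clip(dot, -1.0, 1.0)
    theta = np.arccos(dot)
    if np.sin(theta) < eps:
        # Nearly parallel: use ordinary lerp instead.
        return (1 - t) * v0 + t * v1
    # Standard slerp weighting by sines of the sub-angles.
    return (np.sin((1 - t) * theta) * v0
            + np.sin(t * theta) * v1) / np.sin(theta)

# Toy example: interpolating halfway between two orthogonal vectors
# stays on the unit arc rather than shrinking toward the origin.
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
mid = slerp(0.5, a, b)
```

In a real merge this would be applied per tensor (or per layer group) across the two checkpoints, with `t` possibly varying by layer.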

Key Capabilities

  • Blended Performance: Inherits characteristics from both constituent models, potentially offering balanced performance across a range of NLP tasks.
  • 7 Billion Parameters: Large enough for complex language understanding and generation while remaining practical to deploy on a single GPU.
  • 4096-token Context: Supports processing and generating longer text sequences, suitable for tasks that need broader context.

Good for

  • Experimentation with Merged Models: Ideal for researchers and developers interested in exploring the outcomes of Slerp merging techniques.
  • General Language Tasks: Suitable for a wide range of applications including text completion, summarization, and conversational AI where a blended model approach is desired.
  • Building upon Existing Fine-tunes: Offers a starting point that integrates the fine-tuning efforts of its base models.
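For anyone reproducing this kind of experiment, Slerp merges of two Mistral-7B-family models are commonly expressed as a mergekit configuration. The fragment below is a hypothetical illustration in mergekit's slerp schema, not the recipe actually used for DonutLM-v1; the layer range, per-filter `t` schedules, and dtype are assumptions.

```yaml
# Hypothetical mergekit slerp config (not the published DonutLM-v1 recipe)
slices:
  - sources:
      - model: AIDC-ai-business/Marcoroni-7B-v3
        layer_range: [0, 32]   # assumed: full 32-layer Mistral-7B stack
      - model: jondurbin/bagel-dpo-7b-v0.1
        layer_range: [0, 32]
merge_method: slerp
base_model: AIDC-ai-business/Marcoroni-7B-v3
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]   # assumed per-layer schedule
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5                     # default interpolation fraction
dtype: bfloat16
```

Running `mergekit-yaml` on such a config produces a merged checkpoint that can then be loaded like any other Hugging Face model.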