Undi95/Dawn-v2-70B

Text Generation · Concurrency Cost: 4 · Model Size: 69B · Quant: FP8 · Ctx Length: 32k · Published: Nov 6, 2023 · License: cc-by-nc-4.0 · Architecture: Transformer · Open Weights

Undi95/Dawn-v2-70B is a 69-billion-parameter merged language model created by Undi95 using the layer shuffle method from mergekit. It combines several 70B base models and LoRAs, including ones trained on psychological and medical data, with the LimaRP LoRA applied last. The model targets broad conversational and instruction-following use, drawing on the varied training data of its components.


Undi95/Dawn-v2-70B: A Merged 70B Language Model

Undi95/Dawn-v2-70B is a 69-billion-parameter model developed by Undi95 using the layer shuffle method from mergekit. This approach combines multiple base models and LoRAs layer by layer, aiming to integrate their strengths.

Key Capabilities & Composition

This model is built upon a diverse foundation, incorporating elements from:

  • Base Models: Sao10K/Euryale-1.3-L2-70B, Xwin-LM/Xwin-LM-70B-V0.1, ehartford/Samantha-1.11-70b, NousResearch/Nous-Hermes-Llama2-70b, augtoma/qCammel-70-x, jondurbin/airoboros-l2-c70b-3.1.2.
  • LoRAs: fangloveskari/ORCA_LLaMA_70B_QLoRA, and a final application of Doctor-Shotgun/limarpv3-llama2-70b-qlora.

The merge draws on models trained with psychological and medical data alongside general instruction-following datasets, giving the result a broad skill set. The layer shuffle technique controls, layer by layer, which contributing model supplies each part of the final network; a conceptual sketch follows below.
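To make the idea concrete, here is a minimal conceptual sketch of a layer-shuffle merge in Python. It is illustrative only: the donor list is truncated to two of the six bases, the interleaved layer assignment is hypothetical, and mergekit's actual implementation handles weight naming, tokenizer alignment, and memory management very differently.

```python
# Conceptual sketch of a layer-shuffle merge: every decoder layer of the
# output model is copied wholesale from one of several donor models.
# Illustrative only -- not mergekit's implementation.
import torch
from transformers import AutoModelForCausalLM

# Two of the six actual 70B bases, for brevity (see the list above).
donor_names = [
    "Sao10K/Euryale-1.3-L2-70B",
    "Xwin-LM/Xwin-LM-70B-V0.1",
]
donors = [
    AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.float16)
    for name in donor_names
]

# Hypothetical assignment: layer i of the merge comes from donors[assignment[i]].
# A Llama-2 70B model has 80 decoder layers.
num_layers = donors[0].config.num_hidden_layers
assignment = [i % len(donors) for i in range(num_layers)]  # simple interleave

# Start from the first donor, then overwrite each decoder layer in place.
merged = AutoModelForCausalLM.from_pretrained(
    donor_names[0], torch_dtype=torch.float16
)
for i, src in enumerate(assignment):
    merged.model.layers[i].load_state_dict(
        donors[src].model.layers[i].state_dict()
    )

merged.save_pretrained("dawn-style-merge")  # hypothetical output path
```

The actual recipe also applies LoRAs on top of the merged base; only the per-layer selection idea is shown here.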

Usage and Quantization

The repository provides fp16 files, along with a measurement.json file that lets users produce their own exl2 quantizations using calibration datasets such as wikitext.
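For reference, a minimal sketch of loading the fp16 weights with Hugging Face transformers follows. The prompt is a placeholder and the sampling settings are arbitrary; at fp16 a 70B model needs on the order of 140 GB of accelerator memory, so multi-GPU placement via device_map="auto" (or a quantized variant such as an exl2 conversion) is assumed.

```python
# Minimal sketch: loading the fp16 weights with Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Undi95/Dawn-v2-70B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",  # spreads the 70B model across available GPUs
)

prompt = "Write a short scene between two travelers meeting at dawn."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs, max_new_tokens=200, do_sample=True, temperature=0.8
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```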