KnutJaegersberg/deacon-13b

Text Generation · Concurrency Cost: 1 · Model Size: 13B · Quant: FP8 · Ctx Length: 4k · Published: Sep 20, 2023 · License: cc-by-nc-4.0 · Architecture: Transformer

KnutJaegersberg/deacon-13b is a 13-billion-parameter language model fine-tuned on AI-filtered subsets of the Dolphin dataset and EvolInstruct V2, designed for general conversational tasks. It has a 4096-token context length and shows balanced performance across benchmarks, including 57.85% on ARC (25-shot) and 82.63% on HellaSwag (10-shot). The model is notable for its training data composition, which aims for broad applicability without explicit alignment to specific value systems.


Model Overview

KnutJaegersberg/deacon-13b is a 13-billion-parameter large language model developed by KnutJaegersberg. It was fine-tuned on a distinctive data mix: AI-filtered subsets of the Dolphin dataset combined with EvolInstruct V2. This training approach aims to produce a model with broad conversational capabilities.

Key Characteristics

  • Training Data: Fine-tuned on a blend of AI-filtered subsets of the GPT-4-based Dolphin dataset and EvolInstruct V2.
  • Parameter Count: A substantial 13 billion parameters, offering a balance between performance and computational requirements.
  • Context Length: Supports a 4096-token context window, suitable for moderately long inputs and coherent responses (see the loading sketch after this list).
  • Alignment: The model has not been explicitly aligned to positive, negative, or bureaucratically prescribed value systems, suggesting a rawer, less filtered output style.
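
For orientation, here is a minimal loading sketch using the Hugging Face transformers library, assuming the weights are hosted on the Hub under the repo id shown above; the dtype and device_map settings are illustrative defaults, not requirements stated by the model card.

```python
# Minimal loading sketch (assumes weights are available on the Hugging
# Face Hub under "KnutJaegersberg/deacon-13b"; settings are illustrative).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "KnutJaegersberg/deacon-13b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision keeps a 13B model within a single large GPU
    device_map="auto",          # let accelerate place layers across available devices
)
```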

Performance Highlights

Evaluated on the Open LLM Leaderboard, deacon-13b shows competitive performance across several benchmarks (the reported average is reproduced in the snippet after this list):

  • Avg. Score: 46.78
  • ARC (25-shot): 57.85
  • HellaSwag (10-shot): 82.63
  • MMLU (5-shot): 55.25
  • TruthfulQA (0-shot): 39.33
  • Winogrande (5-shot): 76.32
  • GSM8K (5-shot): 10.39
  • DROP (3-shot): 5.67
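
As a quick sanity check, the reported average is the unweighted mean over all seven benchmarks, including the weak GSM8K and DROP scores; a few lines of Python reproduce it:

```python
# Reproduce the leaderboard average from the seven per-task scores above.
scores = [57.85, 82.63, 55.25, 76.32, 39.33, 10.39, 5.67]
avg = sum(scores) / len(scores)
print(round(avg, 2))  # 46.78
```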

Intended Use Cases

Given its training on diverse instruction-following datasets and lack of explicit value alignment, deacon-13b is suitable for:

  • General-purpose AI assistance: Responding to a wide array of user queries and instructions (see the generation sketch after this list).
  • Exploratory AI research: For developers interested in models with less constrained output characteristics.
  • Creative text generation: Its unique training might lend itself to novel or unconventional outputs.
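
The sketch below shows one way to prompt the model for a general assistance task, reusing the loading code from earlier. The instruction-style prompt template is a hypothetical placeholder, not confirmed by this page; consult the upstream model card for the exact format the fine-tune expects.

```python
# Generation sketch; the prompt template is illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "KnutJaegersberg/deacon-13b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Hypothetical instruction-style prompt; check the upstream model card
# for the template this fine-tune was actually trained with.
prompt = "### Instruction:\nExplain what a context window is.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,  # stays well within the 4096-token context
    do_sample=True,
    temperature=0.7,
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```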