KnutJaegersberg/deacon-13b
KnutJaegersberg/deacon-13b is a 13-billion-parameter language model fine-tuned on AI-filtered subsets of the Dolphin dataset and EvolInstruct V2, designed for general conversational tasks. It features a 4096-token context length and demonstrates balanced performance across various benchmarks, including 57.85% on ARC (25-shot) and 82.63% on HellaSwag (10-shot). The model is notable for its training data composition and for not being explicitly aligned to specific value systems, aiming for broad applicability.
Model Overview
KnutJaegersberg/deacon-13b is a 13-billion-parameter large language model developed by KnutJaegersberg. It was fine-tuned with a distinctive approach, combining AI-filtered subsets of the Dolphin dataset with EvolInstruct V2. This training methodology aims to produce a model with broad conversational capabilities.
Key Characteristics
- Training Data: Fine-tuned on a blend of AI-filtered subsets of the GPT-4-based Dolphin dataset and EvolInstruct V2.
- Parameter Count: 13 billion parameters, balancing capability against computational requirements.
- Context Length: Supports a context window of 4096 tokens, suitable for handling moderately long inputs and generating coherent responses.
- Alignment: The model has not been explicitly aligned to positive, negative, or bureaucratically prescribed value systems, so its outputs may be less filtered than those of heavily aligned models.
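The 4096-token context window must hold both the prompt and any generated tokens, so long inputs need trimming before inference. A minimal sketch of that budgeting; the helper name and the 256-token generation reserve are illustrative assumptions, not part of the model card.

```python
MAX_CONTEXT = 4096  # context window stated in the model card


def fit_prompt(token_ids, max_new_tokens=256):
    """Trim a tokenized prompt so prompt + generation fits the context window.

    Keeps the most recent tokens, dropping the oldest first, and reserves
    max_new_tokens slots for the model's response.
    """
    budget = MAX_CONTEXT - max_new_tokens
    return token_ids if len(token_ids) <= budget else token_ids[-budget:]
```

In practice the token ids would come from the model's tokenizer; keeping the most recent tokens preserves the end of a conversation, which usually matters most for the next response.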
Performance Highlights
Evaluated on the Open LLM Leaderboard, deacon-13b shows competitive performance across several benchmarks:
- Avg. Score: 46.78
- ARC (25-shot): 57.85
- HellaSwag (10-shot): 82.63
- MMLU (5-shot): 55.25
- TruthfulQA (0-shot): 39.33
- Winogrande (5-shot): 76.32
- GSM8K (5-shot): 10.39
- DROP (3-shot): 5.67
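The leaderboard average is simply the unweighted mean of the seven benchmark scores above, which a quick check confirms:

```python
# Unweighted mean of the seven Open LLM Leaderboard scores listed above.
scores = {
    "ARC": 57.85, "HellaSwag": 82.63, "MMLU": 55.25, "TruthfulQA": 39.33,
    "Winogrande": 76.32, "GSM8K": 10.39, "DROP": 5.67,
}
avg = sum(scores.values()) / len(scores)
print(round(avg, 2))  # 46.78
```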
Intended Use Cases
Given its training on diverse instruction-following datasets and lack of explicit value alignment, deacon-13b is suitable for:
- General-purpose AI assistance: Responding to a wide array of user queries and instructions.
- Exploratory AI research: For developers interested in models with less constrained output characteristics.
- Creative text generation: Its unique training might lend itself to novel or unconventional outputs.
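For general-purpose assistance as described above, the model can be driven through the Hugging Face transformers library. A hedged sketch: the repo id comes from this card, but the Alpaca-style prompt template and the sampling settings are assumptions, not documented behaviour, so verify them against the model card before relying on them.

```python
def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in a simple instruction/response template (assumed)."""
    return f"### Instruction:\n{instruction}\n\n### Response:\n"


def generate(instruction: str, max_new_tokens: int = 256) -> str:
    """Generate a response with deacon-13b (downloads ~26 GB of fp16 weights)."""
    from transformers import pipeline  # imported lazily: heavy dependency

    pipe = pipeline(
        "text-generation",
        model="KnutJaegersberg/deacon-13b",
        device_map="auto",
    )
    out = pipe(
        build_prompt(instruction),
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.7,
    )
    return out[0]["generated_text"]
```

For exploratory research use, swapping `do_sample=True` for greedy decoding makes runs reproducible, which helps when comparing the model's less constrained outputs against aligned baselines.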