Overview
formulae/7B-Dorflan is an experimental 7-billion-parameter language model created by formulae through a custom merging process that combines the weights of three distinct foundation models: StableBeluga-7B, dolphin-llama2-7b, and Marcoroni-7B. Notably, no additional fine-tuning was performed after the merge. Its primary purpose is testing and research into the efficacy and characteristics of merged models.
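The custom merging technique itself is not documented. As a rough illustration of what weight merging involves, the sketch below shows the simplest variant, a uniform element-wise average over checkpoints that share the same architecture; the Hugging Face repo IDs are assumptions, not confirmed sources for Dorflan.

```python
# Hypothetical illustration only: Dorflan's actual custom merging technique
# is not documented. This shows the simplest possible approach, a uniform
# element-wise average of weights across checkpoints sharing one architecture.
import torch
from transformers import AutoModelForCausalLM

sources = [
    "stabilityai/StableBeluga-7B",              # assumed repo ID
    "cognitivecomputations/dolphin-llama2-7b",  # assumed repo ID
    "AIDC-ai-business/Marcoroni-7B",            # assumed repo ID
]

# Use the first checkpoint as the accumulator for the running sum.
merged = AutoModelForCausalLM.from_pretrained(sources[0], torch_dtype=torch.float32)
state = merged.state_dict()

with torch.no_grad():
    for name in sources[1:]:
        other = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.float32)
        for key, tensor in other.state_dict().items():
            state[key].add_(tensor)  # requires identical keys/shapes across models
        del other  # release the fp32 weights before loading the next checkpoint

    for tensor in state.values():
        tensor.div_(len(sources))  # turn the accumulated sum into a mean

merged.save_pretrained("7B-Dorflan-merged")
```

Because the state-dict tensors reference the model's own parameters, the in-place average updates `merged` directly before saving. Real merge recipes often use weighted or layer-wise interpolation rather than a plain mean.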
Key Characteristics & Performance
Dorflan's behavior reflects the training data of its constituent models, which includes datasets such as CoT, NIv2, T0, FLAN2021, Dolphin, and OpenOrca. Initial evaluations on the Open LLM Leaderboard report an average score of 47.44; individual benchmark results include:
- ARC (25-shot): 54.44
- HellaSwag (10-shot): 75.78
- MMLU (5-shot): 51.36
- TruthfulQA (0-shot): 51.17
The model has a context length of 4096 tokens.
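Since the card describes no architectural changes beyond the merge, the model should load like any Llama-2-style checkpoint. The snippet below is a minimal usage sketch under that assumption, using the `formulae/7B-Dorflan` repo ID from this card.

```python
# Minimal usage sketch, assuming formulae/7B-Dorflan loads as a standard
# Llama-2-style causal LM through transformers (an assumption, not confirmed
# by the model card).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "formulae/7B-Dorflan"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision so a 7B model fits on one GPU
    device_map="auto",
)

prompt = "Briefly explain what a merged language model is."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Keep prompt plus generation within the 4096-token context window.
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```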
Limitations and Intended Use
As an experimental merge, Dorflan has several known limitations:
- Instability: Combining weights from differently fine-tuned models can produce unpredictable behavior, especially since no post-merge fine-tuning was performed.
- Compounded Biases: Inherits and may amplify biases from all three foundation models.
- Performance Variability: May underperform one or more of its individual base models on certain tasks.
Because it is experimental and largely untested, Dorflan is intended strictly for testing and research. It is not recommended for production systems or for generating public-facing content, and extensive evaluation is required to fully understand its capabilities and ethical implications.