chargoddard/piano-medley-7b

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Dec 10, 2023 · License: cc-by-nc-4.0 · Architecture: Transformer

The chargoddard/piano-medley-7b is a 7 billion parameter language model developed by chargoddard, built upon the Mistral-7B-v0.1 architecture. This model is a TIES merge of several fine-tuned checkpoints, including loyal-piano-m7-cdpo and servile-harpsichord-cdpo, which were trained using cDPO with various binarized feedback datasets. It is instruction-tuned using the Alpaca prompt format and shows improved performance over its individual components in local benchmarks, making it suitable for general conversational AI tasks.


Model Overview

chargoddard/piano-medley-7b is a 7 billion parameter language model developed by chargoddard, based on the mistralai/Mistral-7B-v0.1 architecture. This model represents an experimental approach that combines multiple fine-tuned checkpoints through TIES merging (TrIm, Elect Sign & Merge), a method that reduces interference between models by trimming and sign-electing their task vectors before averaging.
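Conceptually, a TIES merge computes each fine-tune's task vector (its delta from the base weights), trims away low-magnitude entries, elects a per-parameter sign by magnitude-weighted vote, and averages only the trimmed values that agree with that sign. A minimal PyTorch sketch of the procedure for a single weight tensor, as an illustration rather than the actual mergekit implementation:

```python
import torch

def ties_merge(base: torch.Tensor, finetuned: list[torch.Tensor],
               density: float = 0.4) -> torch.Tensor:
    """Illustrative TIES merge of one weight tensor from several fine-tunes."""
    trimmed = []
    for ft in finetuned:
        delta = ft - base                            # task vector
        k = max(1, int(density * delta.numel()))     # entries to keep
        # Magnitude threshold for the top-`density` fraction of entries.
        cutoff = delta.abs().flatten().kthvalue(delta.numel() - k + 1).values
        trimmed.append(torch.where(delta.abs() >= cutoff, delta,
                                   torch.zeros_like(delta)))
    stacked = torch.stack(trimmed)
    elected = stacked.sum(dim=0).sign()              # per-parameter sign vote
    agree = (stacked.sign() == elected) & (stacked != 0)
    merged = (stacked * agree).sum(dim=0) / agree.sum(dim=0).clamp(min=1)
    return base + merged
```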

Key Development Steps

The model's creation involved several stages, building upon previous experiments like loyal-piano-m7:

  • Initial Training: loyal-piano-m7 was trained as an instruction-following fine-tune of Mistral-7B-v0.1.
  • cDPO Fine-tuning: loyal-piano-m7 underwent cDPO (conservative DPO, i.e. DPO with label smoothing for noisy preference labels) on the HuggingFaceH4/ultrafeedback_binarized dataset, resulting in loyal-piano-m7-cdpo; a sketch of the loss follows this list.
  • Parallel Training: Another model, servile-harpsichord, was trained with different sampling from the same source datasets as loyal-piano.
  • cDPO on servile-harpsichord: servile-harpsichord was then fine-tuned with cDPO using allenai/ultrafeedback_binarized_cleaned, Intel/orca_dpo_pairs, and a helpfulness-only version of PKU-Alignment/PKU-SafeRLHF.
  • TIES Merge: The final piano-medley-7b model was created by performing a TIES merge of several checkpoints from servile-harpsichord-cdpo with loyal-piano-m7-cdpo.
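For context, cDPO (conservative DPO) is the standard DPO objective with a label-smoothing term that assumes a small fraction of the binarized preference labels are flipped. A minimal sketch of the loss, assuming per-sequence log-probabilities have already been computed; the beta and label_smoothing defaults are illustrative, not the hyperparameters used for these checkpoints:

```python
import torch
import torch.nn.functional as F

def cdpo_loss(policy_chosen_logps: torch.Tensor,
              policy_rejected_logps: torch.Tensor,
              ref_chosen_logps: torch.Tensor,
              ref_rejected_logps: torch.Tensor,
              beta: float = 0.1,
              label_smoothing: float = 0.1) -> torch.Tensor:
    # Implicit reward margin between chosen and rejected responses,
    # measured relative to the frozen reference model.
    logits = beta * ((policy_chosen_logps - ref_chosen_logps)
                     - (policy_rejected_logps - ref_rejected_logps))
    # Label smoothing: with probability `label_smoothing`, assume the
    # "chosen" response is actually the worse one.
    loss = (-(1 - label_smoothing) * F.logsigmoid(logits)
            - label_smoothing * F.logsigmoid(-logits))
    return loss.mean()
```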

Performance and Usage

In the author's local benchmarks, the merged piano-medley-7b outperforms each of its constituent models. It is instruction-tuned to respond to the Alpaca prompt format, making it suitable for conversational and instruction-following applications. The merge configuration used a density of 0.4 and enabled int8_mask for efficiency.
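For reference, a TIES merge with these settings would be declared in a mergekit configuration along the following lines, shown here as a Python dict mirroring mergekit's YAML schema. The merge method, base model, density of 0.4, and int8_mask come from the card; the exact model list, per-model weights, and dtype are assumptions:

```python
# Hypothetical mergekit-style TIES configuration for piano-medley-7b.
merge_config = {
    "merge_method": "ties",
    "base_model": "mistralai/Mistral-7B-v0.1",
    "models": [
        {"model": "chargoddard/loyal-piano-m7-cdpo",        # merged component
         "parameters": {"density": 0.4, "weight": 1.0}},    # weight assumed
        {"model": "chargoddard/servile-harpsichord-cdpo",   # checkpoint(s)
         "parameters": {"density": 0.4, "weight": 1.0}},    # weight assumed
    ],
    "parameters": {"int8_mask": True},  # store intermediate masks in int8
    "dtype": "bfloat16",                # assumed
}
```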
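Finally, a minimal inference sketch with Hugging Face transformers, wrapping the request in the standard Alpaca instruction template; the prompt and generation settings are illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "chargoddard/piano-medley-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Standard Alpaca instruction template (no-input variant).
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nExplain what a model merge is in one sentence.\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```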