h2m/mhm-7b-v1.3-DPO-1
h2m/mhm-7b-v1.3-DPO-1 is a 7 billion parameter language model developed by h2m, fine-tuned using DPO on the Intel/orca_dpo_pairs dataset. Based on the Mistral architecture, this model is the result of multiple merges involving seven different models from the openllm leaderboard. It offers an 8192 token context length and is primarily an experimental model for general language tasks.
Overview
h2m/mhm-7b-v1.3-DPO-1 is an experimental 7 billion parameter language model, building upon the Mistral architecture. It was created by h2m through a series of merges involving seven distinct models sourced from the openllm leaderboard, utilizing the dare_ties merging technique. The model has been further fine-tuned using Direct Preference Optimization (DPO) on the Intel/orca_dpo_pairs dataset.
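The DPO objective referred to above can be sketched numerically. This is a generic illustration of the standard DPO loss for a single preference pair, not h2m's training code; the function name and the beta value of 0.1 are illustrative, not values reported for this model.

```python
import math

def dpo_loss(pi_chosen: float, pi_rejected: float,
             ref_chosen: float, ref_rejected: float,
             beta: float = 0.1) -> float:
    """Direct Preference Optimization loss for one preference pair.

    Inputs are summed log-probabilities of the chosen and rejected
    responses under the policy being trained (pi_*) and under a frozen
    reference model (ref_*). beta scales the implicit KL penalty; 0.1
    is a common default, not one documented for this model.
    """
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    # -log(sigmoid(beta * margin)): the loss shrinks as the policy
    # prefers the chosen response more strongly than the reference does
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# With no preference shift relative to the reference, the loss sits at
# log(2) ≈ 0.693
print(dpo_loss(-10.0, -12.0, -10.0, -12.0))  # → 0.6931471805599453
```

Training on Intel/orca_dpo_pairs minimizes this quantity over the dataset's chosen/rejected response pairs, nudging the model toward the preferred completions without a separate reward model.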
Key Characteristics
- Base Model: Derived from the mhm-7b-v1.3 model, which itself is based on Mistral.
- Fine-tuning: Enhanced with DPO using the Intel/orca_dpo_pairs dataset.
- Development: Result of an experimental merging process, combining multiple models to achieve its current form.
- Context Length: Supports an 8192 token context window.
Intended Use
This model is presented as an experiment, suitable for researchers and developers interested in exploring the outcomes of complex model merging and DPO fine-tuning on a Mistral-based architecture. It can be applied to general language generation and understanding tasks, with its performance characteristics best evaluated through direct experimentation.
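For that direct experimentation, the model can be loaded with the Hugging Face transformers library. The snippet below is a minimal sketch: the [INST] prompt wrapping is assumed from the Mistral lineage and is not confirmed by this card, so check the repository's chat template before relying on it.

```python
def build_prompt(instruction: str) -> str:
    # Mistral-style instruction wrapping; an assumption based on the
    # base architecture, not documented for this specific model
    return f"<s>[INST] {instruction} [/INST]"

def main() -> None:
    # transformers is imported here so the prompt helper above stays
    # dependency-free
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "h2m/mhm-7b-v1.3-DPO-1"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    prompt = build_prompt("Explain DPO fine-tuning in two sentences.")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # keep prompt plus generation well inside the 8192-token context window
    output = model.generate(**inputs, max_new_tokens=256)
    print(tokenizer.decode(output[0], skip_special_tokens=True))

# main()  # uncomment to run; downloads the full 7B checkpoint
```

Because the model is experimental, sampling parameters (temperature, top_p, repetition penalty) are worth sweeping rather than copying from other Mistral fine-tunes.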