traversaal-ai/traversaal-2.5-Mistral-7B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Jan 31, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold

traversaal-ai/traversaal-2.5-Mistral-7B is a 7 billion parameter language model developed by traversaal-ai, fine-tuned with Direct Preference Optimization (DPO) from the teknium/OpenHermes-2.5-Mistral-7B base model. It features a 4096-token context length and incorporates several hyperparameter optimizations. The model is designed for general language tasks and builds on a base model that was supervised fine-tuned with LoRA using QWEN-72B.
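
The following is a minimal inference sketch using the Hugging Face transformers library, assuming the weights load under the repo id shown above with the standard Mistral-7B architecture; the dtype, device, and sampling settings are illustrative, not prescribed by the model card.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "traversaal-ai/traversaal-2.5-Mistral-7B"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # illustrative; choose a dtype your hardware supports
        device_map="auto",
    )

    prompt = "Explain Direct Preference Optimization in one paragraph."
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))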


traversaal-2.5-Mistral-7B Overview

traversaal-2.5-Mistral-7B is a 7 billion parameter language model developed by traversaal-ai. It is built on teknium/OpenHermes-2.5-Mistral-7B as its base model, which was itself supervised fine-tuned (SFT) with LoRA using the QWEN-72B model. A key differentiator for this model is its training methodology:

Key Capabilities & Training

  • Direct Preference Optimization (DPO): The model was fine-tuned using DPO, a method that aligns models with human preferences without requiring a separate reward model (a minimal training sketch follows this list).
  • Hyperparameter Optimizations: traversaal-ai applied several hyperparameter optimizations during the DPO training phase to improve performance.
  • No Weight Merging: The developers explicitly state that no form of weight merging was used, indicating DPO was applied directly to the base model.
  • Mistral-7B Compatibility: For leaderboard submissions, the trained weights are realigned to ensure compatibility with the standard Mistral-7B architecture.
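
For orientation only, here is a minimal sketch of how a DPO fine-tune of this kind can be set up with the TRL library. It is not the authors' training script: the dataset, hyperparameters, and exact DPOTrainer arguments (which vary across trl versions) are assumptions.

    from datasets import load_dataset
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from trl import DPOConfig, DPOTrainer

    base_id = "teknium/OpenHermes-2.5-Mistral-7B"  # base model named on this card
    model = AutoModelForCausalLM.from_pretrained(base_id)
    tokenizer = AutoTokenizer.from_pretrained(base_id)

    # Any preference dataset with prompt/chosen/rejected columns works; this one is illustrative.
    train_dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

    config = DPOConfig(
        output_dir="dpo-mistral-7b",
        beta=0.1,                       # strength of the preference constraint
        max_length=4096,                # matches the 4k context window listed above
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
    )

    trainer = DPOTrainer(
        model=model,
        args=config,
        train_dataset=train_dataset,
        processing_class=tokenizer,     # named "tokenizer" in older trl releases
    )
    trainer.train()

Because DPO optimizes directly on chosen/rejected preference pairs, no separate reward model is trained, which is the property highlighted in the first bullet above.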

Good For

  • General Language Tasks: Suitable for a wide range of applications benefiting from a 7B parameter model.
  • Preference-Aligned Outputs: The DPO training suggests improved alignment with desired output characteristics and user preferences.
  • Developers seeking a Mistral-7B variant: Offers a DPO-tuned alternative based on a strong SFT foundation.