ignos/LeoScorpius-GreenNode-Platypus-7B-v1

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 8k · Published: Dec 15, 2023 · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold

ignos/LeoScorpius-GreenNode-Platypus-7B-v1 is a 7-billion-parameter language model developed by Ignos, based on the Mistral architecture. It is a merge of viethq188/LeoScorpius-7B-Chat-DPO and GreenNode/GreenNodeLM-7B-v1olet, further fine-tuned on the Open-Platypus dataset. The model targets strong overall results on Hugging Face evaluation metrics, with a particular focus on improved reasoning. It has a context length of 8192 tokens.


Model Overview

ignos/LeoScorpius-GreenNode-Platypus-7B-v1 is a 7-billion-parameter language model developed by Ignos, built on the Mistral architecture. It was created by merging two existing models, viethq188/LeoScorpius-7B-Chat-DPO and GreenNode/GreenNodeLM-7B-v1olet, and then further fine-tuned on the garage-bAInd/Open-Platypus dataset.

Key Capabilities and Focus

  • Enhanced Reasoning: The model's primary objective is improved reasoning, targeting competitive performance across Hugging Face evaluation metrics.
  • Mistral Architecture: Inherits the robust and efficient characteristics of the Mistral-7B-v0.1 base model.
  • Merged Foundation: Benefits from the combined strengths of its constituent models, LeoScorpius-7B-Chat-DPO and GreenNodeLM-7B-v1olet.
  • Platypus Fine-tuning: Leverages the Open-Platypus dataset for specialized instruction following and response generation.

Training Details

The model was fine-tuned using a QLoRA approach, with the resulting QLoRA adapters merged back into the base model. Training used a bitsandbytes quantization configuration with load_in_4bit: True and bnb_4bit_quant_type: nf4, running on 4 x NVIDIA RTX 4090 GPUs. The development process relied on Mergekit for the model merge and Axolotl 0.3.0 for fine-tuning.
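
As a rough illustration, the snippet below loads the published model with the same bitsandbytes 4-bit settings named above (load_in_4bit, nf4). The compute dtype and device_map values are assumptions, not taken from this card.

```python
# Minimal loading sketch using the 4-bit bitsandbytes settings described above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # as stated on the card
    bnb_4bit_quant_type="nf4",              # as stated on the card
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption: not stated on the card
)

model_id = "ignos/LeoScorpius-GreenNode-Platypus-7B-v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # assumption: spread layers across available devices
)
```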

Intended Use Cases

This model is suited to applications that require strong general-purpose language understanding and generation, particularly where reasoning quality is a key requirement. Its merged and fine-tuned lineage makes it a candidate for a range of conversational and instruction-following tasks.
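
A minimal generation sketch using the standard transformers text-generation pipeline follows; the prompt and sampling values are illustrative, not taken from this card.

```python
# Hedged usage sketch: plain text generation with the transformers pipeline.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="ignos/LeoScorpius-GreenNode-Platypus-7B-v1",
    device_map="auto",  # assumption
)

prompt = "Explain step by step why the sum of two odd numbers is always even."
result = generator(prompt, max_new_tokens=256, do_sample=True, temperature=0.7)
print(result[0]["generated_text"])
```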

Popular Sampler Settings

The three most popular parameter combinations used by Featherless users for this model tune the following sampler settings:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
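
For orientation, the sketch below shows how most of these settings map onto transformers' generate() keyword arguments. All values are placeholders rather than the actual user configurations. Note that frequency_penalty and presence_penalty are API-level parameters on OpenAI-compatible endpoints and have no direct generate() kwarg; repetition_penalty is the closest analogue.

```python
# Placeholder sampler values mapped onto transformers' generate() kwargs.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ignos/LeoScorpius-GreenNode-Platypus-7B-v1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer(
    "Briefly explain what a merge of two LLMs means.", return_tensors="pt"
).to(model.device)
outputs = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.7,         # placeholder
    top_p=0.9,               # placeholder
    top_k=40,                # placeholder
    repetition_penalty=1.1,  # placeholder
    min_p=0.05,              # placeholder; requires a recent transformers release
    max_new_tokens=128,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```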