Edens-Gate/Henbane-7b-attempt2

Text Generation · Model Size: 7.6B · Quant: FP8 · Context Length: 32k · Published: Sep 13, 2024 · License: apache-2.0 · Architecture: Transformer

Edens-Gate/Henbane-7b-attempt2 is a 7.6 billion parameter causal language model fine-tuned from Qwen/Qwen2-7B. The model was built with Axolotl using Liger kernel optimizations for improved training performance. It is designed for general text generation tasks, reaching a final validation loss of 1.0222 and an average score of 23.47 on the Open LLM Leaderboard.


Overview

Edens-Gate/Henbane-7b-attempt2 is a 7.6 billion parameter language model fine-tuned from the Qwen/Qwen2-7B base model. It was trained with the Axolotl framework with Liger kernel optimizations enabled (liger_rope, liger_rms_norm, liger_swiglu, and liger_fused_linear_cross_entropy) for improved training efficiency.
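The Liger flags listed above correspond to Axolotl configuration options. The actual training config for this run is not published, so the fragment below is a hypothetical reconstruction showing how those options are typically enabled in an Axolotl YAML config (the plugin path is an assumption based on current Axolotl releases):

```yaml
# Hypothetical reconstruction -- the actual Axolotl config for this run is not published.
base_model: Qwen/Qwen2-7B

plugins:
  - axolotl.integrations.liger.LigerPlugin
liger_rope: true
liger_rms_norm: true
liger_swiglu: true
liger_fused_linear_cross_entropy: true
```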

Training Details

The model underwent training for 2 epochs with a learning rate of 2e-05, utilizing a total batch size of 64 across multiple GPUs. It was fine-tuned on a diverse collection of ShareGPT-formatted datasets, including PocketDoc/Dans-MemoryCore-CoreCurriculum-Small, anthracite-org/kalo_opus_misc_240827, and AquaV/Chemical-Biological-Safety-Applications-Sharegpt, among others. The training achieved a final validation loss of 1.0222.
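The total batch size of 64 is the product of per-device micro-batch size, gradient-accumulation steps, and GPU count. The exact split for this run is not published; the values below are purely illustrative:

```python
# Illustrative only: the actual per-device settings for this run are not published.
micro_batch_size = 2       # examples per GPU per forward pass (assumed)
gradient_accumulation = 4  # micro-batches accumulated per optimizer step (assumed)
num_gpus = 8               # data-parallel workers (assumed)

# Effective (total) batch size seen by the optimizer
total_batch_size = micro_batch_size * gradient_accumulation * num_gpus
print(total_batch_size)  # → 64
```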

Performance Metrics

On the Open LLM Leaderboard, Henbane-7b-attempt2 achieved an average score of 23.47. Specific benchmark results include:

  • IFEval (0-Shot): 41.57
  • BBH (3-Shot): 30.87
  • MMLU-PRO (5-shot): 33.64

Intended Uses

This model is suitable for general-purpose text generation and conversational AI applications, leveraging its fine-tuned capabilities from the Qwen2-7B architecture.