chenjingshen/Llama3-8B-merge-biomed-wizard
chenjingshen/Llama3-8B-merge-biomed-wizard is an 8-billion-parameter language model produced by a DARE-TIES merge of meta-llama/Meta-Llama-3-8B-Instruct, NousResearch/Hermes-2-Pro-Llama-3-8B, and aaditya/Llama3-OpenBioLLM-8B. Developed by chenjingshen using MindNLP Wizard, the model is optimized for both biomedical and general reasoning tasks, and it performs strongly across benchmarks including the MMLU biomedical subsets, GSM8K, and Winogrande.
Llama3-8B-merge-biomed-wizard Overview
This model, developed by chenjingshen, is an 8 billion parameter language model created through a DARE-TIES merge of three distinct Llama 3-based models: meta-llama/Meta-Llama-3-8B-Instruct, NousResearch/Hermes-2-Pro-Llama-3-8B, and aaditya/Llama3-OpenBioLLM-8B. The merge was implemented using MindNLP Wizard on a MindSpore/Ascend runtime stack, with an output dtype of bfloat16.
Key Capabilities and Performance
The model is designed to enhance performance in both general reasoning and specialized biomedical domains. It shows competitive or superior results compared to its base models across several benchmarks:
- General Reasoning: Achieves 70.81% accuracy on GSM8K and 76.01% on Winogrande, outperforming Llama3-8B-Instruct in these areas.
- Biomedical Expertise: Demonstrates strong performance on MMLU biomedical subsets, including 77.57% on MMLU-Professional Medicine and 82.00% on MMLU-Medical Genetics.
- Merge Method: Utilizes the DARE-TIES method, known for resolving parameter interference when merging models, with meta-llama/Meta-Llama-3-8B-Instruct as the base.
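For readers unfamiliar with how a DARE-TIES merge is specified, the sketch below shows a mergekit-style YAML config for this three-model combination. This is illustrative only: the actual merge was performed with MindNLP Wizard, and the `density` and `weight` values here are hypothetical, not the ones used for this model.

```yaml
# Illustrative mergekit-style DARE-TIES config (hypothetical parameters;
# the actual merge was performed with MindNLP Wizard and its exact
# settings are not published in this card).
merge_method: dare_ties
base_model: meta-llama/Meta-Llama-3-8B-Instruct
models:
  - model: NousResearch/Hermes-2-Pro-Llama-3-8B
    parameters:
      density: 0.5   # fraction of delta weights retained (hypothetical)
      weight: 0.5    # contribution to the merge (hypothetical)
  - model: aaditya/Llama3-OpenBioLLM-8B
    parameters:
      density: 0.5
      weight: 0.5
dtype: bfloat16      # matches the card's stated output dtype
```

DARE-TIES first randomly drops a fraction of each model's delta weights (relative to the base) and rescales the rest, then resolves sign conflicts TIES-style before summing, which is what lets the biomedical and general-reasoning specializations coexist in one checkpoint.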
When to Use This Model
- Biomedical Applications: Ideal for tasks requiring strong understanding and generation in medical, biological, and clinical contexts.
- General-Purpose Reasoning: Suitable for applications needing robust logical and mathematical reasoning capabilities.
- Llama 3 Ecosystem: Integrates seamlessly with the Llama 3 prompt format, making it easy to adopt for existing Llama 3 users.
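As a quick illustration of the Llama 3 prompt format mentioned above, the snippet below builds a chat prompt by hand using the standard Llama 3 instruct template. This is a minimal sketch; in practice you would load the model's tokenizer from the Hugging Face Hub and call `tokenizer.apply_chat_template`, which produces this format for you.

```python
# Build a Llama 3 instruct-format prompt by hand (minimal sketch of the
# standard Llama 3 chat template; prefer tokenizer.apply_chat_template
# in real code).

def format_llama3_prompt(messages):
    """messages: list of {"role": ..., "content": ...} dicts."""
    prompt = "<|begin_of_text|>"
    for msg in messages:
        prompt += (
            f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
            f"{msg['content']}<|eot_id|>"
        )
    # Open an assistant header to cue the model to generate its reply.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

messages = [
    {"role": "system", "content": "You are a helpful biomedical assistant."},
    {"role": "user", "content": "What does the BRCA1 gene do?"},
]
print(format_llama3_prompt(messages))
```

Because the model keeps the stock Llama 3 template, any tooling that already serves Llama3-8B-Instruct (vLLM, text-generation-inference, etc.) can serve this merge without prompt changes.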