CobraMamba/mamba-gpt-7b-v2

  • Task: Text generation
  • Model size: 7B
  • Quantization: FP8
  • Context length: 8K
  • Concurrency cost: 1
  • Published: Oct 14, 2023
  • License: apache-2.0
  • Architecture: Transformer
  • Availability: Open weights

CobraMamba/mamba-gpt-7b-v2 is a 7 billion parameter causal language model fine-tuned from Mistral-7B-v0.1. It performs well across standard benchmarks, achieving an average score of 54.85 on the Open LLM Leaderboard. With an 8192-token context length, it is suited to general-purpose language generation and understanding tasks.


CobraMamba/mamba-gpt-7b-v2: A Fine-Tuned Mistral Model

CobraMamba/mamba-gpt-7b-v2 is a 7 billion parameter causal language model fine-tuned from the mistralai/Mistral-7B-v0.1 base model. The fine-tune targets improved scores across a range of evaluation tasks, aiming for strong general-purpose capabilities.

Key Capabilities & Performance

This model has been evaluated on the Open LLM Leaderboard, demonstrating competitive performance:

  • Average Score: 54.85
  • ARC (25-shot): 61.95
  • HellaSwag (10-shot): 83.83
  • MMLU (5-shot): 61.74
  • TruthfulQA (0-shot): 46.63
  • Winogrande (5-shot): 78.45
  • GSM8K (5-shot): 17.29
  • DROP (3-shot): 34.07

These scores indicate solid proficiency in reasoning, commonsense, and language-understanding tasks; the comparatively low GSM8K result points to weaker multi-step arithmetic. The model supports a context length of 8192 tokens, making it suitable for processing moderately long inputs.

Usage

Developers can integrate mamba-gpt-7b-v2 using the Hugging Face transformers library; deployment on a GPU-enabled machine requires transformers (v4.34.0 or newer), accelerate, and torch, as in the sketch below.
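A minimal loading-and-generation sketch using standard transformers APIs; the prompt and sampling values are illustrative, not recommendations from the model authors:

```python
# Minimal sketch: load mamba-gpt-7b-v2 and generate text.
# Assumes transformers>=4.34.0, accelerate, and torch are installed
# and a CUDA-capable GPU is available.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "CobraMamba/mamba-gpt-7b-v2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision keeps a 7B model around 14 GB
    device_map="auto",           # let accelerate place weights on available GPUs
)

prompt = "Explain, in two sentences, what a causal language model is."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.7,  # illustrative sampling settings
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```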

Popular Sampler Settings

Featherless surfaces the top three parameter combinations its users apply to this model. The configurable sampler parameters are: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
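As a hedged sketch of how these parameters might be set when querying the model through an OpenAI-compatible endpoint: the base URL below is an assumption rather than something confirmed on this page, the parameter values are illustrative, and non-standard parameters (top_k, repetition_penalty, min_p) are passed through the client's extra_body escape hatch under the assumption that the server accepts those names.

```python
# Sketch: setting the sampler parameters listed above via an
# OpenAI-compatible chat completions endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",  # assumed endpoint; substitute your provider's
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="CobraMamba/mamba-gpt-7b-v2",
    messages=[{"role": "user", "content": "Summarize the strengths of Mistral-7B fine-tunes."}],
    # Standard OpenAI-style sampler parameters:
    temperature=0.7,
    top_p=0.9,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    # Provider-specific parameters go in extra_body; the names mirror the
    # list above and the values here are illustrative only:
    extra_body={"top_k": 40, "repetition_penalty": 1.1, "min_p": 0.05},
)
print(response.choices[0].message.content)
```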