CobraMamba/mamba-gpt-7b

Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | Published: Sep 24, 2023 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm

CobraMamba/mamba-gpt-7b is a 7 billion parameter causal language model fine-tuned from OpenLLaMA by CobraMamba. It surpasses the larger dolly-v2-12b on the Open LLM Leaderboard and is trained on a mix of datasets including Alpaca, Open Assistant, LIMA, and GPT-4 generated data. The model is tuned for general instruction following and conversational tasks.


CobraMamba/mamba-gpt-7b: A High-Performing 7B Instruction-Tuned Model

CobraMamba/mamba-gpt-7b is a 7 billion parameter instruction-tuned language model developed by CobraMamba and fine-tuned from openlm-research/open_llama_7b_v2. The fine-tuning data combines Stanford Alpaca, Open Assistant (oasst1), LIMA, CodeAlpaca 20k, GPT-4 Generated Data, and UltraChat.

Key Capabilities & Performance

  • Strong Instruction Following: Fine-tuned on diverse instruction datasets, so it responds well to a wide range of prompts.
  • Competitive Benchmarking: Ranks among the strongest 7B models on the Open LLM Leaderboard and surpasses the larger dolly-v2-12b.
  • Multilingual Support: Training data includes multilingual conversations from Open Assistant and Chinese data from the GPT-4 generated datasets.
  • Code Understanding: Training data includes CodeAlpaca 20k, which may improve performance on code-related tasks.

Ideal Use Cases

  • General-purpose Chatbots: Its strong instruction following and diverse training make it suitable for conversational AI applications.
  • Text Generation: Capable of generating coherent and contextually relevant text for various applications.
  • Research and Development: Provides a solid base for further fine-tuning or experimentation in natural language processing tasks.
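
For the chatbot and text-generation use cases above, a minimal loading sketch with the Hugging Face transformers library might look like the following. Several parts are assumptions rather than details from this page: the prompt layout in `build_prompt` is a hypothetical plain-text format (the model's expected template is not stated here), the 4k context length is taken to mean 4096 tokens, and `device_map="auto"` assumes the accelerate package is installed.

```python
MODEL_ID = "CobraMamba/mamba-gpt-7b"
CONTEXT_TOKENS = 4096  # assumption: the listed "4k" context length means 4096 tokens


def build_prompt(instruction: str) -> str:
    """Hypothetical plain-text prompt; not the model's documented template."""
    return f"Instruction: {instruction.strip()}\nResponse:"


def max_new_tokens_for(prompt_tokens: int, requested: int = 256) -> int:
    """Clamp the generation budget so prompt + output fit the assumed context window."""
    remaining = CONTEXT_TOKENS - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt already fills the assumed context window")
    return min(requested, remaining)


def generate(instruction: str, requested: int = 256) -> str:
    """Load the model and return only the newly generated text."""
    # Imported lazily so the helpers above work without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16, device_map="auto"
    )
    inputs = tokenizer(build_prompt(instruction), return_tensors="pt").to(model.device)
    prompt_len = inputs["input_ids"].shape[1]
    budget = max_new_tokens_for(prompt_len, requested)
    output = model.generate(**inputs, max_new_tokens=budget)
    # Slice off the echoed prompt tokens before decoding.
    return tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True)
```

Calling `generate("Explain instruction tuning in two sentences.")` downloads the weights on first use, so the two helper functions are kept separate and can be checked without loading the model.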