Name: ibm-ai-platform/Bamba-9B-v1 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: ibm-ai-platform

Overview

Bamba-9B-v1 is a 9-billion-parameter decoder-only language model developed by ibm-ai-platform and built on the Mamba-2 architecture. It was pretrained in two stages: first on 2 trillion tokens from the Dolma v1.7 dataset, then on a further 200 billion tokens from a curated, high-quality data blend. The second stage is intended to refine the model and improve output quality across text generation tasks.

Key Capabilities

Mamba-2 Architecture: Uses the Mamba-2 state-space model architecture for efficient sequence processing.
Extensive Pretraining: Trained on a total of 2.2 trillion tokens (2 trillion in the first stage plus 200 billion in the second), giving it broad language coverage.
Text Generation: Designed to handle a wide range of text generation tasks.
Quantization Support: FP8-quantized versions are available for more efficient storage and inference. At 8 bits per weight, they need roughly half the weight memory of a 16-bit checkpoint.
Hugging Face Integration: Works with Hugging Face Transformers for inference and fine-tuning.
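
The Hugging Face integration mentioned above can be illustrated with a minimal inference sketch. It assumes the Hub repository ID matches the name in this listing, that the installed Transformers release includes Bamba support, and that PyTorch is available; the `generate` helper and its default `max_new_tokens` value are hypothetical names chosen for illustration, not part of any official API.

```python
# Assumption: the Hub repository ID is the same as the name shown in this listing.
MODEL_ID = "ibm-ai-platform/Bamba-9B-v1"


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Hypothetical helper: load the model and return one completion for `prompt`."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Tokenize the prompt into PyTorch tensors and generate a continuation.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode the first (and only) sequence back into text.
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]
```

Calling `generate("Hello! I am Bamba.")` downloads the full checkpoint on first use, so it is best run on a machine with enough disk space and memory for a 9-billion-parameter model.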

Good For

General Text Generation: Suitable for various applications requiring text output.
Research and Development: Offers a Mamba-2 based model for exploring alternative architectures.
Resource-Efficient Deployment: Quantized versions enable deployment in environments with memory constraints.
Fine-tuning: Supports fine-tuning for specific downstream tasks using tools such as the TRL SFTTrainer.
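
To make the memory argument for quantized deployment concrete, the following back-of-the-envelope sketch estimates weight storage at different precisions. It assumes a nominal 9 billion parameters and a 16-bit baseline, and it ignores activations, recurrent state, caches, and any quantization metadata; the `weight_gb` helper is a hypothetical name used only for this illustration.

```python
# Nominal parameter count from the overview ("9 billion parameter").
PARAMS = 9_000_000_000

# Bytes needed to store one weight at each precision.
# Assumption: the unquantized checkpoint uses a 16-bit format such as bf16.
BYTES_PER_PARAM = {"bf16": 2, "fp8": 1}


def weight_gb(num_params: int, bytes_per_param: int) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


for name, nbytes in BYTES_PER_PARAM.items():
    print(f"{name}: ~{weight_gb(PARAMS, nbytes):.0f} GB")
# prints:
# bf16: ~18 GB
# fp8: ~9 GB
```

Real memory use during inference will be higher than these figures, since they count weights only.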
