Name: budecosystem/boomer-1b API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: budecosystem

Overview

budecosystem/boomer-1b is a 1.1-billion-parameter language model developed by BudEcosystem, pretrained from scratch on a custom-curated dataset of 41 billion tokens. The model uses a custom architecture with flash attention and an increased intermediate MLP layer dimension. The training data combines wiki, stories, arXiv, math, and code.

Key Capabilities

Custom Architecture: Features flash attention and a higher intermediate MLP dimension for potentially improved efficiency and performance.
Pretrained from Scratch: Trained on a custom-curated 41 billion token dataset rather than initialized from an existing checkpoint.
Fine-tuning Support: Provides scripts for easy fine-tuning on custom datasets using finetune.py.
Inference Generation: Includes a generate.py script for straightforward text generation from the Hugging Face model hub.
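
The generate.py script itself is not reproduced here. As a rough sketch of what generating text from the hub checkpoint might look like, the snippet below uses the Hugging Face transformers Auto classes. It assumes the checkpoint loads through AutoTokenizer and AutoModelForCausalLM without extra options; the function name generate and the max_new_tokens default are illustrative choices, not taken from the model card.

```python
MODEL_ID = "budecosystem/boomer-1b"  # repository name as listed above


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the model from the Hugging Face hub and continue `prompt`.

    Assumption: the checkpoint is compatible with the transformers Auto
    classes. The real generate.py may differ.
    """
    # Imported lazily so this module can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    model.eval()

    # Tokenize the prompt into PyTorch tensors and generate a continuation.
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode the full sequence (prompt plus continuation) back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling generate("Once upon a time") downloads the weights on first use, so it needs network access and enough memory to hold the model.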

Performance Benchmarks

Reported scores on several benchmarks:

ARC: 22.35
MMLU: 25.92
HumanEval: 6.1
HellaSwag: 31.66
BBH: 28.65
DROP: 6.13
GSM8K: 1.5

Good For

Retrieval Augmentation: Can be integrated into retrieval-augmented generation systems.
Inference at the Edge: Its small parameter count (1.1 billion) makes it suitable for deployment in resource-constrained environments.
Language Modeling Use Cases: Applicable for various general language understanding and generation tasks.
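
To make the edge-deployment point concrete, the following back-of-the-envelope sketch estimates the memory needed just to hold the weights of a 1.1 billion parameter model at a few common precisions. The bytes-per-parameter figures are standard for those data types; the helper name weight_memory_gb is hypothetical, and the estimate ignores activations, the KV cache, and runtime overhead, so real usage will be higher.

```python
PARAMS = 1.1e9  # parameter count stated in the overview

# Storage size of one parameter at each precision, in bytes.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Return the weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


for dtype, nbytes in BYTES_PER_PARAM.items():
    print(f"{dtype}: {weight_memory_gb(PARAMS, nbytes):.1f} GB")
```

Under these assumptions the weights take roughly 4.4 GB in float32, 2.2 GB in float16, and 1.1 GB in int8.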
