Mrw33554432/bitLinear-phi-1.5
Mrw33554432/bitLinear-phi-1.5 is a 1.4 billion parameter causal language model based on the phi-1.5 architecture, partially quantized using the 1-bit method described in "The Era of 1-bit LLMs." Quantization is applied only to the weights of its linear layers, leaving other components untouched, in order to isolate the impact of binary weight quantization. The model was trained on a subset of the Wikipedia dataset for research validation, with a focus on exploring efficient model architectures.
Overview
Mrw33554432/bitLinear-phi-1.5 is a 1.4 billion parameter language model built upon the phi-1.5 architecture. Its core innovation is a partial implementation of the BitLinear quantization method: 1-bit quantization is applied to the weights of its linear layers (excluding the lm_head). This approach aims to isolate and evaluate the effectiveness of binary weight quantization as described in the paper "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits" (arXiv:2402.17764), without incorporating other components such as RMSNorm or activation quantization.
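To make the approach concrete, below is a minimal sketch of a BitLinear-style layer in PyTorch. It binarizes weights to {-1, +1} with a per-tensor scale and uses a straight-through estimator for training; the layer shipped with this repository may differ in details such as centering, scaling granularity, or the STE formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BitLinear(nn.Linear):
    """Linear layer with 1-bit (sign) weight quantization.

    A sketch in the spirit of BitLinear, not the exact layer in this repo.
    """

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.weight
        # Per-tensor scale so the binarized weights keep the
        # average magnitude of the latent full-precision weights.
        scale = w.abs().mean()
        # Binarize to {-scale, +scale} (sign(0) stays 0, which is rare).
        w_q = torch.sign(w) * scale
        # Straight-through estimator: the forward pass uses w_q, while
        # gradients flow to the latent full-precision weights.
        w_ste = w + (w_q - w).detach()
        return F.linear(x, w_ste, self.bias)
```

Replacing the standard nn.Linear modules of phi-1.5 (except the lm_head) with such a layer reproduces the setup described above.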
Key Characteristics
- Architecture: Based on Microsoft's phi-1.5, with custom BitLinear layers replacing standard linear layers.
- Quantization: Implements 1-bit quantization for the weights of its linear layers (all except the lm_head), focusing on efficiency research.
- Training Data: Trained on a small subset (100,000 samples) of the English Wikipedia dataset for research validation.
- Performance Note: The current kernel is not optimized for 1-bit matrix operations, so inference is slower than the stock model. Roughly 3x faster inference is possible with a custom kernel available in the project's GitHub repository.
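For reference, a hypothetical loading snippet using the standard transformers API is shown below. The model id is real, but whether this checkpoint requires trust_remote_code=True for its custom BitLinear layers is an assumption; check the repository files before relying on it.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Mrw33554432/bitLinear-phi-1.5"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# trust_remote_code=True is assumed here because the repo replaces
# standard linear layers with custom BitLinear modules.
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("The meaning of life is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```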
Research Focus
This model serves as a research vehicle for understanding the implications and performance of 1-bit weight quantization in LLMs. It highlights the potential for a reduced memory footprint and lower computational cost, though inference speed is currently limited by the unoptimized kernel. Developers exploring efficient model architectures and quantization techniques will find it particularly relevant.
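As a rough illustration of the memory argument, the weight-only footprint can be estimated as follows. The calculation assumes all 1.4B parameters are binarized, which overstates the savings since the lm_head and non-linear components remain in higher precision.

```python
# Back-of-the-envelope weight memory, ignoring activations and any
# layers kept in full precision. Illustrative only.
n_params = 1.4e9

fp16_gb = n_params * 2 / 1e9      # 16 bits = 2 bytes per weight
onebit_gb = n_params / 8 / 1e9    # 1 bit per weight

print(f"fp16 weights:  ~{fp16_gb:.2f} GB")    # ~2.80 GB
print(f"1-bit weights: ~{onebit_gb:.2f} GB")  # ~0.18 GB
```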