WitchLM-1.5B: A Compact Model with Extensive Context
WitchLM-1.5B, developed by arcee-ai, is a 1.5-billion-parameter language model notable for its exceptionally large context window of 131,072 tokens. Built with the Axolotl framework, the model targets general language-understanding tasks.
Key Capabilities
- Long-Context Handling: Accepts inputs of up to 131,072 tokens, so entire long documents or extended conversations can fit in a single prompt.
- General Language Understanding: Reported benchmark scores include instruction following (inst_level_strict_acc: 0.3357), BBH reasoning (acc_norm: 0.3591), and MMLU (acc: 0.2441), modest results in line with the model's small parameter count.
- Training Setup: Trained for 5 epochs with a learning rate of 5e-05, a total batch size of 64, and a cosine learning-rate scheduler.
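The cosine scheduler mentioned above decays the learning rate smoothly from its peak to zero over training. A minimal sketch of that decay curve, assuming no warmup phase and an illustrative step count (only the 5e-05 peak comes from the model card):

```python
import math

# Peak learning rate from the model card; step counts below are
# illustrative assumptions, not values from the actual training run.
PEAK_LR = 5e-05

def cosine_lr(step: int, total_steps: int, peak_lr: float = PEAK_LR) -> float:
    """Cosine decay from peak_lr at step 0 down to 0 at total_steps."""
    progress = step / total_steps
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# LR at the start, midpoint, and end of a hypothetical 1,000-step run.
print(cosine_lr(0, 1000))     # peak: 5e-05
print(cosine_lr(500, 1000))   # midpoint: ~2.5e-05
print(cosine_lr(1000, 1000))  # end: 0.0
```

In practice, frameworks like Axolotl typically add a short linear warmup before the cosine decay; the curve here covers only the decay portion.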
Good For
- Applications requiring a model with a very large input capacity to process extensive documents or conversations.
- Use cases where a small parameter count is preferred for efficiency but a long context window is still required.
- General text generation and understanding tasks where maintaining coherence over long inputs is crucial.
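When deciding whether a document fits the 131,072-token window, a quick pre-check can use the common chars-per-token heuristic for English text. The ratio below is an assumption, not the model's actual tokenizer; use the real tokenizer for exact counts:

```python
# Rough check of whether a document fits WitchLM-1.5B's context window.
# The chars/4 ratio is a rule-of-thumb for English text, NOT the model's
# actual tokenizer, so treat the result as an estimate only.
CONTEXT_WINDOW = 131_072
CHARS_PER_TOKEN = 4  # heuristic assumption

def approx_tokens(text: str) -> int:
    """Approximate token count from character length."""
    return len(text) // CHARS_PER_TOKEN

def fits_in_context(text: str, reserve_for_output: int = 1024) -> bool:
    """True if the text plus a generation budget fits the window."""
    return approx_tokens(text) + reserve_for_output <= CONTEXT_WINDOW

doc = "lorem ipsum " * 40_000  # ~480,000 characters, ~120,000 tokens
print(approx_tokens(doc), fits_in_context(doc))
```

A document that fails this check would need to be truncated, chunked, or summarized before being sent to the model.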