arcee-ai/GLM-4-32B-Base-32K

Parameters: 32B
Precision: FP8
Context length: 32,768 tokens
License: MIT

GLM-4-32B-Base-32K Overview

GLM-4-32B-Base-32K is a 32-billion-parameter language model developed by arcee-ai, built on THUDM's GLM-4-32B-Base-0414. Its primary differentiator is significantly enhanced long-context capability: it maintains strong performance up to a 32,768-token context window, whereas the original model's capabilities degraded beyond 8,192 tokens.

Key Capabilities & Improvements

  • Extended Context Window: Reliably processes information across a 32,768-token context, a fourfold improvement over the base model's effective 8,192 tokens.
  • Improved Recall: Demonstrates significantly better performance on Needle in a Haystack (NIAH) benchmarks at longer context lengths, with averages of 98.3% at 16,384 tokens and 76.5% at 32,768 tokens, compared to the base model's 66.1% and 0.4% respectively.
  • Enhanced General Benchmarks: Achieves an approximately 5% overall improvement on standard base-model benchmarks, including arc_challenge (64.93%), mmlu (77.87%), and winogrande (80.03%).
  • Development Methodology: Built through targeted long-context continued pretraining, iterative merging of model checkpoints, and short-context distillation to preserve the base model's original capabilities.
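The NIAH figures above come from burying a short "needle" fact at varying depths in long filler text and checking whether the model can retrieve it. A minimal sketch of that setup (the prompt wording, filler sentence, and scoring rule here are illustrative assumptions, not the exact harness arcee-ai used):

```python
def build_niah_prompt(needle: str, depth: float, n_filler: int = 200,
                      filler: str = "The grass is green and the sky is blue.") -> str:
    """Build a Needle-in-a-Haystack prompt: bury `needle` at a relative
    `depth` (0.0 = start, 1.0 = end) inside repeated filler sentences."""
    sentences = [filler] * n_filler
    pos = int(depth * len(sentences))
    sentences.insert(pos, needle)
    haystack = " ".join(sentences)
    return (f"{haystack}\n\n"
            "What is the secret number mentioned in the text above? Answer:")

def niah_hit(model_answer: str, needle_value: str) -> bool:
    """Score one retrieval attempt: did the answer reproduce the needle's value?"""
    return needle_value in model_answer

# Example: a needle buried halfway through the haystack.
prompt = build_niah_prompt("The secret number is 7421.", depth=0.5)
print(niah_hit("The secret number is 7421", "7421"))  # True
```

A full evaluation sweeps `depth` and the haystack length (e.g. up to 16,384 and 32,768 tokens) and averages the hit rate, which is how the percentages in the bullet above are produced.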

Use Cases

This model is designed as a robust base for continued training, particularly for applications that require deep understanding and processing of extensive textual data. Its strong long-context performance makes it suitable for tasks such as:

  • Summarization of long documents
  • Question answering over large text corpora
  • Context-aware content generation
  • Any application demanding reliable information retrieval and processing across extended inputs
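For corpora that exceed even the 32,768-token window, inputs are typically split into overlapping chunks that each fit the context. A minimal sketch, assuming a rough ~4-characters-per-token heuristic (an assumption for illustration; a real pipeline would count tokens with the model's own tokenizer):

```python
def chunk_for_context(text: str, max_tokens: int = 32768,
                      overlap_tokens: int = 512,
                      chars_per_token: int = 4) -> list[str]:
    """Split `text` into overlapping windows sized to fit the model's
    32,768-token context, using a crude chars-per-token estimate."""
    max_chars = max_tokens * chars_per_token
    step = (max_tokens - overlap_tokens) * chars_per_token
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # last window already reaches the end of the text
    return chunks

# Example: a 300k-character document yields a handful of overlapping windows.
windows = chunk_for_context("a" * 300_000)
print(len(windows))
```

The overlap keeps facts that straddle a chunk boundary visible in at least one window; each window can then be summarized or queried independently.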