Overview
GL-Marvin-32k-32B: A GLM-4-Based Model with Extended Context
GL-Marvin-32k-32B is a 32-billion-parameter language model developed by ConicCat, built on the GLM-4 architecture. The model underwent supervised fine-tuning (SFT) focused on strengthening context handling and on performance in Alpaca-style evaluations. A key feature is its 32,768-token context window, which is memory-efficient: holding the full context requires only about 2GB of VRAM, so a quantized build (e.g., Q4_K_M) fits on a single 24GB GPU. This gives it best-in-class context capacity among dense 32B models with that hardware footprint.
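As a rough loading sketch (not an official recipe), a Q4_K_M GGUF of the model can be opened with the full 32,768-token context via llama-cpp-python; the GGUF file name below is hypothetical, so substitute whichever quant you actually downloaded.

```python
from llama_cpp import Llama

# Hypothetical file name; use the actual Q4_K_M quant you downloaded.
llm = Llama(
    model_path="GL-Marvin-32k-32B.Q4_K_M.gguf",
    n_ctx=32768,      # full 32,768-token context window
    n_gpu_layers=-1,  # offload all layers to the GPU
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize the document below.\n..."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```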
Key Capabilities
- Extended Context: Supports a 32,768 token context length, enabling processing of longer inputs and maintaining conversational coherence over extended interactions.
- VRAM Efficiency: Optimized for low VRAM usage, making it accessible for deployment on consumer-grade GPUs (see the cache-size sketch after this list).
- GLM-4 Architecture: Leverages the strong foundational capabilities of the GLM-4 base model.
- Alpaca Evaluation Focus: Fine-tuned with an emphasis on maximizing performance on Alpaca-style evaluations.
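The ~2GB full-context figure is consistent with a back-of-the-envelope KV-cache estimate. A minimal sketch, assuming a GLM-4-32B-style grouped-query attention configuration (the layer and head counts below are assumptions taken from the GLM-4-32B base config; check them against this model's config.json):

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_elem: int = 2) -> int:
    """Size of the KV cache: one K and one V tensor per layer."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

# Assumed GLM-4-32B-style config values; verify against config.json.
est = kv_cache_bytes(n_layers=61, n_kv_heads=2, head_dim=128, seq_len=32768)
print(f"~{est / 2**30:.2f} GiB")  # ~1.91 GiB at fp16, in line with the ~2GB claim
```

The low figure comes from grouped-query attention: only the small number of KV heads is cached, not the full set of query heads.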
Training and Influences
The model's development acknowledges contributions from:
- THUDM: For the foundational GLM-4 base model.
- nyu-dice-lab: For the WildChat-50M dataset.
- AI2: For the Tulu dataset.
Good For
- Applications requiring processing of long documents or extensive conversational history.
- Developers seeking a powerful 32B model that runs efficiently on a single consumer GPU (see the usage sketch after this list).
- Tasks benefiting from a model fine-tuned for general language understanding and generation, with a focus on context retention.
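As a starting point, a minimal long-document sketch with Hugging Face transformers might look like the following; the repository id is assumed from the model's name and should be verified on the Hub.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id assumed from the model name; verify it on the Hub before use.
model_id = "ConicCat/GL-Marvin-32k-32B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

long_document = open("report.txt").read()  # any document up to ~32k tokens
messages = [{"role": "user",
             "content": f"Summarize the key points:\n\n{long_document}"}]

inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```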