allura-org/remnant-glm4-32b


Remnant GLM4 32B: Roleplaying and Conversation Model

Remnant GLM4 32B is a 32-billion-parameter language model developed by allura-org, fine-tuned specifically for SFW and NSFW roleplaying and conversational applications. Built on the GLM-4 architecture, it is designed to generate engaging, contextually rich dialogue.

Key Capabilities

  • Specialized for Roleplaying: Optimized for creating dynamic and immersive SFW and NSFW roleplay scenarios.
  • Extended Context: Features a 32,768-token context length, enabling long, coherent conversations.
  • GLM4 Architecture: Benefits from the underlying capabilities of the GLM-4 base model.
  • Quantization Available: GGUF quants are provided by bartowski, with EXL3 and EXL2 planned.

Recommended Usage

  • Chat Template: Utilizes the GLM4 chat template for optimal performance.
  • Sampler Settings: Recommended settings include a temperature of 1.0 and min_p of 0.1.
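The recommended min_p value of 0.1 keeps only tokens whose probability is at least 10% of the most likely token's, trimming the improbable tail while leaving flat distributions diverse. A minimal sketch of that filtering step (illustrative only; inference engines apply this internally during sampling):

```python
def min_p_filter(probs: dict, min_p: float = 0.1) -> dict:
    """Drop tokens whose probability falls below min_p times the top
    token's probability, then renormalize what remains."""
    threshold = min_p * max(probs.values())
    kept = {tok: p for tok, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}

# With min_p = 0.1, only tokens at least 10% as likely as the top token
# survive; "xyzzy" (0.02 < 0.1 * 0.5) is filtered out.
probs = {"the": 0.5, "a": 0.3, "xyzzy": 0.02}
filtered = min_p_filter(probs, min_p=0.1)
```

Temperature is applied to the logits before this step; at the recommended temperature of 1.0 the model's raw distribution is used unchanged.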

Training Details

The model was fine-tuned with Axolotl on the allura-org/inkmix-v3.0 dataset. Training ran for 2 epochs at a sequence length of 8192 tokens using QLoRA adaptation.
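A run like the one described could be expressed in an Axolotl config along these lines (a hypothetical sketch, not the actual config used; the base-model checkpoint, dataset type, and LoRA rank/alpha values are assumptions):

```yaml
# Hypothetical Axolotl config reflecting the training details above.
base_model: THUDM/GLM-4-32B-0414   # assumed GLM-4 base checkpoint
datasets:
  - path: allura-org/inkmix-v3.0
    type: chat_template            # assumed dataset format
sequence_len: 8192
num_epochs: 2
adapter: qlora
load_in_4bit: true
lora_r: 64                         # placeholder rank
lora_alpha: 128                    # placeholder
lora_target_linear: true
```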