18-Death/sq-rot13-base64-gsm8k

TEXT GENERATIONConcurrency Cost:1Model Size:3.1BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Jun 16, 2026Architecture:Transformer Cold

The 18-Death/sq-rot13-base64-gsm8k model is a 3.1 billion parameter language model fine-tuned using SFT (Supervised Fine-Tuning) with the TRL library. This model is based on an unspecified base architecture and features a context length of 32768 tokens. It is designed for general text generation tasks, leveraging its fine-tuned capabilities for diverse conversational prompts.

Loading preview...

Overview

The 18-Death/sq-rot13-base64-gsm8k is a 3.1 billion parameter language model that has undergone Supervised Fine-Tuning (SFT) using the TRL library. While the specific base model is not detailed, this fine-tuned version is equipped with a substantial context length of 32768 tokens, allowing it to process and generate longer sequences of text.

Key Capabilities

  • Text Generation: Capable of generating coherent and contextually relevant text based on user prompts.
  • Long Context Handling: Benefits from a 32768-token context window, suitable for tasks requiring extensive input or generating detailed responses.
  • SFT Training: Leverages Supervised Fine-Tuning for improved performance on specific tasks, though the exact nature of the fine-tuning dataset is not specified in the provided information.

Training Details

The model was trained using the SFT method, with the following framework versions:

  • TRL: 1.3.0
  • Transformers: 5.6.2
  • Pytorch: 2.10.0
  • Datasets: 4.8.4
  • Tokenizers: 0.22.2

Good For

  • General text generation applications.
  • Scenarios requiring processing or generating long passages of text.
  • Exploratory use cases where a fine-tuned model with a large context window is beneficial.