tinyllms/qwen2.5-7b-instruct-sft-game24-qlora-16384

Text Generation · Concurrency Cost: 1 · Model Size: 7.6B · Quant: FP8 · Ctx Length: 32k · Published: Mar 15, 2026 · Architecture: Transformer · Cold

The tinyllms/qwen2.5-7b-instruct-sft-game24-qlora-16384 model is a 7.6-billion-parameter instruction-tuned language model, fine-tuned from Qwen/Qwen2.5-7B-Instruct using QLoRA. It supports a 16384-token maximum sequence length and is optimized specifically for the Game24 arithmetic puzzle. Training computed loss only on assistant completions rather than on prompts, making the model well suited to structured reasoning and problem solving within its specialized domain.


Model Overview

This model, tinyllms/qwen2.5-7b-instruct-sft-game24-qlora-16384, is a 7.6-billion-parameter instruction-tuned variant of the Qwen/Qwen2.5-7B-Instruct base model. It was fine-tuned with QLoRA, combining 4-bit NF4 quantization with LoRA adapters, and the adapters were merged into the base weights prior to upload. Its 16384-token maximum sequence length allows it to process long contexts.
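As a rough sketch, a QLoRA setup like the one described can be expressed with `peft` and `bitsandbytes`. The rank and alpha below come from the training details on this card; the target modules, dropout, and compute dtype are common choices assumed here, not confirmed by the card:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization as used by QLoRA; double quantization and a
# bfloat16 compute dtype are typical defaults (assumed, not stated here).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA rank 64 / alpha 128 per the model card; targeting the attention and
# MLP projections is a common convention for Qwen-style models (assumed).
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
```

After training, the adapters can be folded into the base weights with `peft`'s `merge_and_unload()`, which would match the merged checkpoint described above.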

Key Training Details

  • Base Model: Qwen/Qwen2.5-7B-Instruct
  • Fine-tuning Method: QLoRA (4-bit NF4 quantization, LoRA rank 64, alpha 128)
  • Context Length: 16384 tokens
  • Dataset: Trained on tinyllms/game24-trajectories, with a focus on examples relevant to the Game24 problem.
  • Loss Calculation: Utilizes completion_only_loss, meaning loss is computed exclusively on assistant completion tokens, masking prompt tokens.
  • Infrastructure: Training was conducted on NVIDIA H100 80GB GPUs using TRL 0.29 + Ray Train.
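The completion_only_loss setting can be illustrated with the standard label-masking convention: prompt positions are assigned the ignore index -100, which cross-entropy skips, so gradients flow only from assistant completion tokens. A minimal sketch with made-up token ids:

```python
IGNORE_INDEX = -100  # label value ignored by PyTorch cross-entropy loss

def mask_prompt_labels(input_ids, prompt_len):
    """Build labels for completion-only loss: positions belonging to the
    prompt are set to IGNORE_INDEX, so loss is computed exclusively on
    the assistant completion tokens that follow."""
    return [IGNORE_INDEX if i < prompt_len else tok
            for i, tok in enumerate(input_ids)]

# Toy sequence: 4 prompt tokens followed by 3 completion tokens.
ids = [101, 7592, 2088, 102, 2023, 2003, 24]
labels = mask_prompt_labels(ids, prompt_len=4)
print(labels)  # [-100, -100, -100, -100, 2023, 2003, 24]
```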

Primary Differentiator

This model is fine-tuned specifically for the Game24 problem using a specialized trajectory dataset. Its training configuration, notably completion-only loss and the 16384-token maximum sequence length, optimizes it for generating precise solutions and long reasoning traces within this domain. That specialization distinguishes it from general-purpose instruction-tuned models on structured reasoning tasks like Game24.
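For context, Game24 asks for an arithmetic expression that combines four given numbers, each used once, with +, -, *, / to reach 24. The dataset's exact answer format is not specified here, but a minimal checker for candidate expressions, written as an illustration, could look like this:

```python
import ast
import math
from collections import Counter

def is_valid_game24(expr: str, numbers: list[int]) -> bool:
    """Check a Game24 answer: the expression must use exactly the given
    numbers (each once) with +, -, *, / and evaluate to 24."""
    tree = ast.parse(expr, mode="eval")
    used = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Constant):
            used.append(node.value)
        elif not isinstance(node, (ast.Expression, ast.BinOp, ast.UnaryOp,
                                   ast.Add, ast.Sub, ast.Mult, ast.Div,
                                   ast.USub)):
            return False  # reject names, calls, and other operators
    if Counter(used) != Counter(numbers):
        return False  # must use exactly the given numbers
    try:
        value = eval(compile(tree, "<expr>", "eval"))
    except ZeroDivisionError:
        return False
    return math.isclose(value, 24)

print(is_valid_game24("8 / (3 - 8 / 3)", [8, 3, 8, 3]))  # True
print(is_valid_game24("4 * 6", [4, 6, 1, 1]))            # False: 1s unused
```

A checker like this is also how model outputs on Game24 are typically scored: parse the emitted expression, verify number usage, and test the value against 24.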