tinyllms/qwen2.5-7b-instruct-sft-game24-qlora

Text generation · Concurrency cost: 1 · Model size: 7.6B · Quant: FP8 · Context length: 32k · Published: Mar 15, 2026 · Architecture: Transformer

The tinyllms/qwen2.5-7b-instruct-sft-game24-qlora model is a 7.6 billion parameter instruction-tuned language model, fine-tuned from Qwen/Qwen2.5-7B-Instruct using QLoRA on the tinyllms/game24-trajectories dataset. This targeted fine-tuning makes it a specialized solver for the Game24 mathematical puzzle.


Model Overview

The tinyllms/qwen2.5-7b-instruct-sft-game24-qlora model starts from the base Qwen/Qwen2.5-7B-Instruct architecture; its distinguishing feature is specialized training on the tinyllms/game24-trajectories dataset, which makes it highly proficient at the Game24 puzzle: given four numbers, combine all of them with +, -, ×, ÷ and parentheses to reach exactly 24. For example, given 4, 7, 8, 8, one valid solution is (7 - 8 ÷ 8) × 4 = 24.
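
As a quick illustration, the sketch below loads the model with Hugging Face transformers and asks for a Game24 solution. The repo id is taken from this card; the prompt wording is an assumption, since the card does not document the exact format used in the training trajectories.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tinyllms/qwen2.5-7b-instruct-sft-game24-qlora"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Qwen2.5-Instruct models use a chat template, so the request is
# formatted as a chat turn rather than raw text. The phrasing of the
# task is an assumption, not the documented training format.
messages = [
    {"role": "user",
     "content": "Use the numbers 4, 7, 8, 8 with +, -, *, / to make 24."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```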

Key Training Details

  • Base Model: Qwen/Qwen2.5-7B-Instruct
  • Fine-tuning Method: QLoRA (4-bit NF4 quantization with double quantization; a configuration sketch follows this list)
  • Targeted Task: Game24 puzzle solving
  • Dataset: tinyllms/game24-trajectories
  • Loss Calculation: completion_only_loss (loss computed only on assistant completion tokens)
  • Hardware: NVIDIA H100 80GB GPUs
  • Framework: TRL 0.29 + Ray Train
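
A minimal sketch of the QLoRA SFT setup described in this list, assuming a recent TRL/PEFT/bitsandbytes API. The Ray Train orchestration mentioned above is omitted, and hyperparameters such as the LoRA rank are illustrative assumptions, as the card does not publish them.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

# 4-bit NF4 base weights with double quantization, as listed above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

dataset = load_dataset("tinyllms/game24-trajectories", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-7B-Instruct",
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="qwen2.5-7b-instruct-sft-game24-qlora",
        completion_only_loss=True,  # loss only on assistant completion tokens
        model_init_kwargs={"quantization_config": bnb_config},
    ),
    # LoRA rank and target modules left at library defaults (assumption).
    peft_config=LoraConfig(task_type="CAUSAL_LM"),
)
trainer.train()
```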

When to Use This Model

  • Specialized Game24 Solver: The model is optimized specifically for generating Game24 solutions; fine-tuning on a dedicated trajectory dataset makes it well suited to this narrow task.
  • Research on Task-Specific Fine-tuning: Ideal for researchers exploring the impact of highly specialized instruction-tuning on a base model for a particular reasoning challenge.
  • Efficient Deployment: Because the model was trained with QLoRA, its weights can be loaded in 4-bit for inference, which can give a smaller memory footprint than full-precision models of similar size for this targeted use case (see the loading sketch after this list).
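
For the deployment point above, a sketch of a memory-efficient local load follows, assuming bitsandbytes is installed. It mirrors the NF4 configuration used in training; note that the hosted endpoint itself serves FP8 per the metadata at the top of this card.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Load the fine-tuned model in 4-bit NF4 to cut memory use.
model = AutoModelForCausalLM.from_pretrained(
    "tinyllms/qwen2.5-7b-instruct-sft-game24-qlora",
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)
```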