DatOneStormyz/Solor-TXT-7B-Ultra

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:7.6BQuant:FP8Ctx Length:32kPublished:Apr 16, 2026License:apache-2.0Architecture:Transformer0.0K Open Weights Warm

Solor-TXT-7B-Ultra is a 7.6 billion parameter instruction-tuned language model developed by DatOneStormyz, based on Qwen2.5-7B-Instruct, with a 32768 token context length. It is specifically optimized for deep reasoning and complex multi-turn conversations by utilizing an internal thought process. This model excels at maintaining logical coherence and strategic planning across extended dialogues.

Loading preview...

Solor-TXT-7B-Ultra: Enhanced Reasoning for Complex Conversations

Solor-TXT-7B-Ultra, developed by DatOneStormyz, is a fine-tuned version of the Qwen2.5-7B-Instruct model, specifically engineered for advanced reasoning and robust multi-turn conversational capabilities. With 7.6 billion parameters and a 32768 token context length, this model is designed to "think before it speaks" by processing complex context and strategy within a dedicated <thought> block.

Key Capabilities

  • Advanced Reasoning: Incorporates specialized thought-patterns to significantly improve logical consistency in long and intricate conversations.
  • Multi-Turn Mastery: Trained on a carefully filtered subset of the UltraChat 200k dataset (2,500 curated turns) to ensure stable and high-quality dialogue performance.
  • Strategic Processing: Utilizes an internal thought mechanism, activated by a specific system prompt, to enhance strategic planning and coherence in responses.

Training Details

This model was fine-tuned using QLoRA (Rank 16) on an NVIDIA A100, leveraging BF16 precision. Its training on UltraChat 200k focuses on developing its ability to handle extended, complex conversational flows effectively.

Recommended Use

To fully leverage its reasoning capabilities, users should employ the suggested system prompt: "You are Solor-TXT. You must think deeply inside tags before every response, especially in long conversations." This model is ideal for applications requiring deep logical processing and sustained, coherent dialogue.