prithivMLmods/QwQ-LCoT-3B-Instruct

  • Parameters: 3.1B
  • Tensor type: BF16
  • Context length: 32768
  • License: creativeml-openrail-m
Overview

QwQ-LCoT-3B-Instruct is a 3.1-billion-parameter instruction-tuned language model developed by prithivMLmods. It is built on the Qwen2.5-3B-Instruct base model and fine-tuned on the QwQ-LongCoT-130K dataset, which comprises 133,000 annotated samples focused on logical tasks and structured thinking. The model's primary differentiator is its specialization in long chain-of-thought (LCoT) reasoning, enabling it to generate comprehensive, step-by-step explanations for complex queries.
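A minimal inference sketch using the Hugging Face transformers library is shown below. It follows the standard Qwen2.5 chat-template workflow; the system prompt is illustrative only (not specified by this model card), and `device_map="auto"` assumes the accelerate package is installed:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "prithivMLmods/QwQ-LCoT-3B-Instruct"


def build_messages(question: str) -> list[dict]:
    # Illustrative system prompt encouraging step-by-step reasoning;
    # the exact prompt is an assumption, not part of the model card.
    return [
        {"role": "system", "content": "You are a helpful assistant. Think step by step."},
        {"role": "user", "content": question},
    ]


def generate(question: str, max_new_tokens: int = 512) -> str:
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    # Render the chat messages into the model's expected prompt format.
    prompt = tokenizer.apply_chat_template(
        build_messages(question), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(
        out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

For LCoT-style outputs, allow a generous `max_new_tokens` budget, since the model is tuned to produce long, multi-step explanations rather than terse answers.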

Key Features

  • Long Chain-of-Thought Reasoning: Designed to produce detailed, step-by-step explanations for intricate problems.
  • Lightweight and Efficient: At roughly 3 billion parameters, suited to environments with limited computational resources without sacrificing reasoning capability.
  • Instruction Optimization: Fine-tuned to accurately follow prompts and deliver structured, actionable responses.

Capabilities

  • Text Generation: Provides detailed, structured, and logical text outputs.
  • Reasoning Tasks: Capable of solving step-by-step problems in mathematics, logic, and science.
  • Educational Assistance: Generates coherent explanations for academic and research purposes.
  • Dialogue and Summarization: Handles conversational queries and effectively summarizes long documents.

Training Details

The model was fine-tuned from the Qwen2.5-3B-Instruct base model using the amphora/QwQ-LongCoT-130K dataset.