ApatheticWithoutTheA/gemma-2-2b-it-R1-Reasoning

Text generation · Concurrency cost: 1 · Model size: 2.6B · Quantization: BF16 · Context length: 8k · Published: Feb 23, 2025 · License: MIT · Architecture: Transformer · Open weights

ApatheticWithoutTheA/gemma-2-2b-it-R1-Reasoning is a 2.6 billion parameter instruction-tuned model, fine-tuned by ApatheticWithoutTheA from the gemma-2-2b-it base. Trained with LoRA under Apple's MLX framework on the sequelbox/Raiden-DeepSeek-R1 dataset, it targets instruction-following and complex reasoning tasks, generating detailed chain-of-thought responses. With a context length of 8192 tokens, it is particularly suited to question answering and reasoning-heavy applications on consumer hardware.
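
Since the fine-tune was produced with MLX, the quickest way to try it locally is the mlx-lm package. The following is a minimal sketch, assuming Apple-silicon hardware, `pip install mlx-lm`, and that the repository ships weights mlx-lm can load; exact API details may differ across versions.

```python
# Minimal usage sketch (assumptions: mlx-lm is installed and the repo's
# weights are loadable by it; the API may vary slightly by version).
from mlx_lm import load, generate

model, tokenizer = load("ApatheticWithoutTheA/gemma-2-2b-it-R1-Reasoning")

# Gemma-2 instruct models expect a chat-formatted prompt.
messages = [{"role": "user", "content": "Why does ice float on water?"}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False
)

print(generate(model, tokenizer, prompt=prompt, max_tokens=512))
```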


ApatheticWithoutTheA/gemma-2-2b-it-R1-Reasoning Overview

This model is a specialized fine-tuned version of the gemma-2-2b-it base, developed by ApatheticWithoutTheA and optimized for enhanced instruction-following and complex reasoning. Fine-tuning used LoRA under MLX, running for 600 iterations over the sequelbox/Raiden-DeepSeek-R1 dataset, which comprises 62.9k examples generated by DeepSeek R1.
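
For intuition about that setup, here is a rough sketch of an equivalent LoRA configuration written with Hugging Face PEFT rather than the author's actual MLX pipeline; the rank, alpha, and target modules shown are illustrative assumptions, not reported values.

```python
# Illustrative analogue of the described LoRA fine-tune, using Hugging Face
# PEFT (the author used MLX tooling; hyperparameters here are guesses).
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "google/gemma-2-2b-it"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.bfloat16)

# Low-rank adapters on the attention projections; only these small matrices
# are trained, which is what makes LoRA cheap enough for consumer hardware.
config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()

# The dataset named above: 62.9k reasoning examples generated by DeepSeek R1.
train_data = load_dataset("sequelbox/Raiden-DeepSeek-R1", split="train")
# From here, a standard supervised fine-tuning loop (e.g. trl's SFTTrainer)
# would run for the 600 iterations described above.
```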

Key Capabilities

  • Advanced Reasoning: Generates detailed chain-of-thought reasoning for complex problems, improving upon the base model's ability to process intricate instructions.
  • Instruction Following: Highly proficient in understanding and executing user instructions.
  • Question Answering: Delivers straightforward answers for simple queries and elaborate reasoning for more challenging questions.
  • Coding: Capable of assisting with coding tasks.

Good For

  • Applications requiring robust reasoning-based problem-solving.
  • Question answering systems that benefit from detailed explanations.
  • Coding assistance and related tasks.
  • Deployment on consumer hardware, thanks to its compact 2.6B parameter size.

Limitations

While generally effective, the model may fail to produce chain-of-thought reasoning for complex problems unless explicitly prompted to do so. On extremely difficult reasoning tasks, it can also enter prolonged "thinking" loops without reaching a conclusive answer.
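
One practical workaround for both issues is to request step-by-step reasoning explicitly and to cap the generation length. A hedged sketch, again assuming the mlx-lm API from the earlier example:

```python
# Mitigation sketch: explicitly cue chain-of-thought and bound output length
# so a runaway "thinking" loop cannot run indefinitely.
from mlx_lm import load, generate

model, tokenizer = load("ApatheticWithoutTheA/gemma-2-2b-it-R1-Reasoning")

question = (
    "If 3 machines make 3 widgets in 3 minutes, "
    "how long would 100 machines take to make 100 widgets?"
)
messages = [{
    "role": "user",
    # Explicit cue: without it, the model may skip chain-of-thought.
    "content": f"Think through this step by step before answering.\n\n{question}",
}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False
)

# max_tokens acts as a hard stop if the reasoning loops without concluding.
print(generate(model, tokenizer, prompt=prompt, max_tokens=1024))
```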