UniReason-Qwen3-14B-RL is a 14-billion-parameter, Transformer-based language model developed by ReasoningTransferability and fine-tuned from Qwen3-14B with reinforcement learning using GRPO (Group Relative Policy Optimization). Its training focuses on advanced mathematical reasoning, and it was built as part of research into whether math reasoning skills transfer to general LLM tasks. The goal is to study how math-focused training affects broader problem-solving and general language understanding.
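For readers unfamiliar with GRPO, its central idea is to score each sampled answer relative to the other answers drawn for the same prompt, rather than against a learned value function. The sketch below illustrates only that group-relative advantage step. It is not the authors' training code: the function name, the binary reward values, and the choice of population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the group sampled for one prompt.

    Hypothetical helper: subtracts the group mean and divides by the
    group standard deviation (population std is an assumption here;
    implementations differ on this detail).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when every sample gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Assumed reward scheme: 1.0 if a sampled solution's final answer is
# correct, 0.0 otherwise, for four samples of the same math problem.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Correct samples receive a positive advantage and incorrect ones a negative advantage, so the policy update favors the better answers within each group. When all samples in a group score the same, the advantages are zero and that prompt contributes no learning signal.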