hyunseoki/verl-math-transfer-7bi-to-3bi-fix07-pool7to1

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:7.6BQuant:FP8Ctx Length:32kPublished:Mar 28, 2026Architecture:Transformer Warm

The hyunseoki/verl-math-transfer-7bi-to-3bi-fix07-pool7to1 model is a 7.6 billion parameter Qwen2ForCausalLM architecture, specifically an experimental math transfer model trained with the verl framework. It represents a transfer experiment from a 7B to a 3B configuration, focusing on mathematical tasks. This model is designed for specialized applications requiring mathematical reasoning capabilities, offering various checkpoint revisions for development flexibility.

Loading preview...

Overview

This repository hosts the hyunseoki/verl-math-transfer-7bi-to-3bi-fix07-pool7to1 model, an experimental math transfer model developed using the verl framework. It is based on the Qwen2ForCausalLM architecture and represents a transfer from a 7 billion parameter configuration down to a 3 billion parameter configuration, specifically optimized for mathematical tasks.

Key Characteristics

  • Architecture: Qwen2ForCausalLM.
  • Parameter Count: 7.6 billion parameters.
  • Context Length: Supports a context length of 32768 tokens.
  • Training Focus: Specialized in mathematical transfer learning experiments using the verl framework.
  • Checkpoints: Includes multiple exported checkpoint revisions (e.g., step-010 to step-070), with main pointing to the latest (step-070).
  • Export Format: Checkpoints are exported from verl FSDP shards into Hugging Face safetensors format.

Use Cases

This model is particularly suited for research and development in:

  • Mathematical Reasoning: Applications requiring strong mathematical problem-solving abilities.
  • Model Compression Research: Exploring the effectiveness of transferring capabilities from larger to smaller models while retaining performance in specific domains.
  • Experimental AI: For developers and researchers interested in verl-based training and transfer learning methodologies for specialized tasks.