mjf-su/ADEn-MAC-CF
mjf-su/ADEn-MAC-CF is a 4 billion parameter language model developed by mjf-su, fine-tuned from mjf-su/PhysicalAI-reason-VLA-MetaAction-1e. It was trained with GRPO, a reinforcement learning method introduced to improve mathematical reasoning, and supports a context length of 32768 tokens. The model targets tasks that require mathematical reasoning and multi-step problem-solving, building on its base model's physical AI reasoning foundation.
Overview
mjf-su/ADEn-MAC-CF is a 4 billion parameter language model developed by mjf-su, fine-tuned from the mjf-su/PhysicalAI-reason-VLA-MetaAction-1e base model. It was trained with the TRL (Transformer Reinforcement Learning) library.
Key Capabilities
- Enhanced Mathematical Reasoning: The model was trained using GRPO (Group Relative Policy Optimization), the method introduced in the paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models". This training approach aims to improve performance on mathematical reasoning tasks.
- Fine-tuned from a Physical AI Reasoning Model: ADEn-MAC-CF builds on mjf-su/PhysicalAI-reason-VLA-MetaAction-1e and is intended to combine that model's physical AI reasoning with mathematical problem-solving.
- Large Context Window: Supports a context length of 32768 tokens, allowing it to process and generate longer, more complex sequences of text.
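The 32768-token context length covers the prompt and the generated tokens together, so a long prompt leaves less room for output. A minimal sketch of that budget check follows; the helper name `max_new_tokens_for` and its parameters are hypothetical and not part of the model or any library.

```python
# Context length stated on this model card.
CONTEXT_LENGTH = 32768


def max_new_tokens_for(prompt_tokens: int, requested: int,
                       context_length: int = CONTEXT_LENGTH) -> int:
    """Cap a requested generation length so prompt + output fit in the window.

    Hypothetical helper for illustration only.
    """
    if prompt_tokens >= context_length:
        raise ValueError("prompt alone fills or exceeds the context window")
    remaining = context_length - prompt_tokens
    return min(requested, remaining)


# A 30000-token prompt leaves 2768 tokens for generation.
budget = max_new_tokens_for(prompt_tokens=30000, requested=4096)
```

The returned value could be passed as the generation length limit in whichever inference framework is used.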
Training Details
The model was trained with GRPO, the reinforcement learning method introduced in the DeepSeekMath paper. Training run details are available via Weights & Biases. Framework versions used: TRL 0.26.1, Transformers 4.57.6, and PyTorch 2.10.0.
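The core idea of GRPO is to sample a group of completions for the same prompt and score each one relative to the rest of the group, rather than against a learned value function. The sketch below shows only that group-relative advantage step in plain Python. It is not the training code used for this model: the function name, the `eps` constant, and the use of the population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize rewards within one group of completions for a single prompt.

    Each advantage is (reward - group mean) / (group std + eps).
    eps (an assumed constant) avoids division by zero when all rewards match.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations may differ
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled completions for one prompt, with a 0/1 correctness reward.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that score above the group mean receive positive advantages and are reinforced; those below the mean receive negative advantages. If every completion in a group earns the same reward, all advantages are zero and that group contributes no learning signal.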
Good for
- Applications requiring strong mathematical reasoning.
- Tasks that benefit from a large context window for complex problem-solving.
- Research and development in advanced AI reasoning, particularly in mathematical and physical domains.