Rex1090/PEARL-8B
Rex1090/PEARL-8B is an 8 billion parameter multimodal reasoning model developed by Chi Zhang, fine-tuned from Qwen3-VL-8B-Instruct. This model specializes in Perceptual-Evidence Anchored Reinforced Learning for multimodal tasks, offering a 32768 token context length. It is designed for advanced multimodal reasoning applications, leveraging its specialized training for interpreting and processing diverse data types.
Loading preview...
Overview
Rex1090/PEARL-8B is an 8 billion parameter multimodal reasoning model developed by Chi Zhang, building upon the Qwen3-VL-8B-Instruct architecture. This model is specifically designed for Perceptual-Evidence Anchored Reinforced Learning (PEARL), focusing on enhancing multimodal reasoning capabilities. It supports a substantial context length of 32768 tokens, making it suitable for complex tasks requiring extensive input analysis.
Key Capabilities
- Multimodal Reasoning: Excels at tasks that require understanding and integrating information from various modalities.
- Perceptual-Evidence Anchored Learning: Utilizes a reinforced learning approach anchored by perceptual evidence, as detailed in its associated paper (arxiv.org/abs/2511.18437).
- High Context Length: Benefits from a 32768-token context window, allowing for processing of longer and more detailed inputs.
Training Details
The model was fine-tuned using the ViRL39k dataset and leveraged the EasyR1 framework for its training process. Its development is documented in the PEARL GitHub repository.
Good For
- Applications requiring advanced multimodal understanding.
- Research and development in multimodal AI, particularly for reasoning tasks.
- Use cases that can benefit from a model with a large context window for multimodal inputs.