IIGroup/X-Coder-RL-Qwen3-8B

Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Published: Jan 10, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights

IIGroup/X-Coder-RL-Qwen3-8B is an 8-billion-parameter code reasoning foundation model developed by IIGroup, built on the X-Coder-SFT-Qwen3-8B base model. It is trained with reinforcement learning on fully synthetic RL data, using the GRPO training method. The model is reported to perform strongly on competitive programming tasks, which makes it a fit for code generation, problem-solving, and code reasoning applications.

X-Coder-RL-Qwen3-8B: Code Reasoning Foundation Model

IIGroup's X-Coder-RL-Qwen3-8B is an 8-billion-parameter language model built for code reasoning. It starts from the IIGroup/X-Coder-SFT-Qwen3-8B base model and is trained with GRPO (Group Relative Policy Optimization) on a fully synthetic dataset, IIGroup/X-Coder-RL-40k. This training stage is part of the X-Coder RLVR recipe and targets complex competitive programming problems.
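To make the GRPO step more concrete, the sketch below shows the group-relative advantage that gives the method its name: several solutions are sampled for the same problem, and each one's reward is normalized against the mean and standard deviation of its group. This is a minimal illustration, not the X-Coder training code; the function name, the `eps` constant, and the use of the population standard deviation are assumptions.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the other samples drawn for the same prompt.

    Hypothetical helper: `eps` and the population standard deviation are
    illustrative choices, not confirmed details of the X-Coder recipe.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # Samples above the group mean get a positive advantage, samples below
    # it a negative one; eps avoids division by zero when all rewards match.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled solutions to one problem; 1.0 means the solution passed its tests.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

When every sample in a group earns the same reward, all advantages are zero, so such a group contributes no learning signal under this formulation.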

Key Capabilities

  • Code Reasoning: Reports strong average performance on the LiveCodeBench v5 and v6 competitive programming benchmarks.
  • Reinforcement Learning Enhanced: Uses RLVR (Reinforcement Learning with Verifiable Rewards) on synthetic data to improve code generation and problem-solving.
  • Python Code Generation: Capable of generating functional Python code, as shown in examples for common algorithmic problems.
  • High Context Length: Supports a context length of up to 32,768 tokens, beneficial for handling larger codebases or complex problem descriptions.
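For readers unfamiliar with reward signals that can be checked automatically, the sketch below shows one common shape for competitive programming: run a candidate program on each test input and compare its standard output with the expected answer. It is a simplified illustration under stated assumptions; the function name `verifiable_reward`, the binary 0/1 scoring, and the timeout are hypothetical and are not taken from the X-Coder repository. A real pipeline would also sandbox untrusted code.

```python
import subprocess
import sys


def verifiable_reward(source, tests, timeout=5.0):
    """Return 1.0 if the program prints the expected output for every test, else 0.0.

    Hypothetical reward function for illustration only. It runs the candidate
    with the current Python interpreter and no sandboxing.
    """
    for stdin_text, expected in tests:
        try:
            result = subprocess.run(
                [sys.executable, "-c", source],
                input=stdin_text,
                capture_output=True,
                text=True,
                timeout=timeout,
            )
        except subprocess.TimeoutExpired:
            return 0.0  # too slow counts as a failure
        # A crash or any mismatch in the trimmed output fails the whole sample.
        if result.returncode != 0 or result.stdout.strip() != expected.strip():
            return 0.0
    return 1.0


# A toy "A + B" problem with two test cases (input text, expected output).
candidate = "a, b = map(int, input().split())\nprint(a + b)"
tests = [("1 2\n", "3"), ("10 -4\n", "6")]
reward = verifiable_reward(candidate, tests)
```

Binary scoring is the simplest choice; a fractional score (tests passed divided by total tests) is an alternative when partial credit is useful.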

Good For

  • Competitive Programming: Ideal for tasks requiring advanced algorithmic understanding and code generation.
  • Code Generation: Generating solutions for programming challenges and general coding tasks.
  • Code Reasoning Applications: Any use case demanding high-fidelity code understanding and logical problem-solving within a coding context.

For detailed training information and code, refer to the X-Coder GitHub repository.