DCAgent/a1-crosscodeeval_java
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Mar 23, 2026 · License: other · Architecture: Transformer

DCAgent/a1-crosscodeeval_java is an 8 billion parameter language model fine-tuned from Qwen/Qwen3-8B and optimized for Java code evaluation tasks. It supports a 32,768 token context length and was trained on a specialized cross-code evaluation dataset. Its primary strength is analyzing and processing Java code, which makes it well suited to code-related applications.


Model Overview

Building on the Qwen/Qwen3-8B base, DCAgent/a1-crosscodeeval_java retains the base model's 8 billion parameters and 32,768 token context window while being adapted, via supervised fine-tuning, for Java code evaluation tasks.

Key Capabilities

  • Java Code Evaluation: Fine-tuning on the /e/scratch/jureap59/raoof1/sft_data/hf_hub/datasets--DCAgent--exp_rpt_crosscodeeval-java_10k_glm_4.7_traces_jupiter/snapshots/a32226762a9eb22eeb7ed132909af2fa5ac3c83e_thinking_preprocessed dataset sharpens its ability to understand and evaluate Java code.
  • Extended Context Window: The 32,768 token context length lets the model process larger code files and more complex problem descriptions in a single pass; a loading sketch follows this list.
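For readers who want to try the model, here is a minimal loading sketch using the Hugging Face transformers library. It assumes the checkpoint is published on the Hub under the model ID above; the dtype and device settings are illustrative choices, not values from this card.

```python
# Minimal loading sketch; the model ID is from this card, everything else is
# an illustrative assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DCAgent/a1-crosscodeeval_java"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # defer to the dtype stored in the checkpoint config
    device_map="auto",   # shard across available GPUs (requires accelerate)
)

# The tokenizer should report the 32,768-token window; inputs longer than
# this must be truncated before generation.
print(tokenizer.model_max_length)
```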

Training Details

The model was trained for 7 epochs at a learning rate of 4e-05, using a cosine learning rate scheduler with a 0.1 warmup ratio and the AdamW_TORCH_FUSED optimizer. Training ran on 16 GPUs with a total batch size of 16.
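As a rough illustration, the reported hyperparameters map onto transformers.TrainingArguments as shown below. The per-device batch size of 1 is inferred from the stated total of 16 across 16 GPUs (assuming no gradient accumulation), and the mixed-precision flag is an assumption not stated on this card.

```python
# Hypothetical reconstruction of the reported setup; the values come from
# this card, but the mapping to TrainingArguments is an assumption.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="a1-crosscodeeval_java-sft",
    learning_rate=4e-5,             # reported learning rate
    num_train_epochs=7,             # reported number of epochs
    lr_scheduler_type="cosine",     # reported cosine schedule
    warmup_ratio=0.1,               # reported warmup ratio
    per_device_train_batch_size=1,  # 16 total / 16 GPUs = 1 per device
    optim="adamw_torch_fused",      # reported AdamW_TORCH_FUSED optimizer
    bf16=True,                      # assumed mixed precision, not stated
)
```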

Intended Use Cases

This model is particularly well suited to applications that require in-depth analysis, generation, or evaluation of Java code. Its domain-specific fine-tuning should give it an edge over general-purpose models of comparable size on such tasks.
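As a hypothetical example of such usage, the snippet below asks the model to review a short Java method. The chat-style prompt format is an assumption, since this card does not document one; Qwen3-derived checkpoints typically ship a chat template.

```python
# Hypothetical inference example; the prompt format is an assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DCAgent/a1-crosscodeeval_java"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

java_snippet = """
public static int sum(int[] xs) {
    int total = 0;
    for (int i = 0; i <= xs.length; i++) {  // off-by-one bug to find
        total += xs[i];
    }
    return total;
}
"""

messages = [{"role": "user",
             "content": f"Review this Java method and point out any bugs:\n{java_snippet}"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```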