WisdomShell/ADG-WizardLM-LLaMa3-8B

Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 8k | Tool Calling: Supported | Published: Apr 12, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

WisdomShell/ADG-WizardLM-LLaMa3-8B is an 8-billion-parameter LLaMa3-based model developed by Bo Li, Mingda Wang, Shikun Zhang, and Wei Ye. It is fine-tuned on instruction data selected with the Answer Divergence-Guided Selection (ADG) method. ADG scores each instruction by the geometric structure of multiple sampled answers rather than by a single reference response. The goal is better instruction-tuning quality on reasoning, knowledge, and coding tasks under a fixed data budget.


Overview of ADG-WizardLM-LLaMa3-8B

WisdomShell/ADG-WizardLM-LLaMa3-8B is an 8-billion-parameter LLaMa3-based model fine-tuned on instruction data chosen by Answer Divergence-Guided Selection (ADG). Developed by Bo Li, Mingda Wang, Shikun Zhang, and Wei Ye, ADG aims to improve instruction-tuning quality by selecting the most impactful examples under a fixed data budget. Rather than relying on a single reference response, ADG samples multiple answers per instruction from a base model using stochastic decoding and scores the instruction by the geometric structure of those answers.
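To make "geometric structure of multiple answers" concrete, here is a minimal sketch of one way such a score could be computed from already-embedded answers. The exact ADG formulas are not given on this page, so everything below is an assumption for illustration: the function name `dispersion_and_anisotropy` is hypothetical, dispersion is taken as the mean squared distance to the centroid, and anisotropy is taken as tr(C^2) / tr(C)^2 for the covariance matrix C. It is not the authors' implementation.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dispersion_and_anisotropy(answers: List[Vector]) -> Tuple[float, float]:
    """Return (dispersion, anisotropy) for the embedded answers of one instruction.

    ASSUMED definitions (illustrative, not taken from the ADG paper):
      dispersion = trace of the covariance matrix C
                 = mean squared distance of the answers to their centroid
      anisotropy = tr(C^2) / tr(C)^2, which is 1/d when variance is spread
                   evenly over d axes and 1 when it lies on a single axis
    """
    n = len(answers)
    d = len(answers[0])

    # Centre the answer embeddings on their mean.
    centroid = [sum(a[j] for a in answers) / n for j in range(d)]
    centered = [[a[j] - centroid[j] for j in range(d)] for a in answers]

    # Population covariance matrix (d x d).
    cov = [
        [sum(row[i] * row[j] for row in centered) / n for j in range(d)]
        for i in range(d)
    ]

    trace = sum(cov[i][i] for i in range(d))
    if trace == 0.0:
        # All sampled answers map to the same point: no divergence at all.
        return 0.0, 0.0

    # tr(C^2) equals the squared Frobenius norm of the symmetric matrix C.
    trace_of_square = sum(cov[i][j] ** 2 for i in range(d) for j in range(d))
    return trace, trace_of_square / trace ** 2
```

In practice the inputs would be sentence embeddings of the sampled answers, and a vectorised library would replace the nested lists; plain Python is used here only to keep the sketch dependency-free.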

Key Capabilities & Methodology

  • Geometry-Aware Scoring: ADG samples multiple answers for each instruction, maps them into a representation space, and computes scores based on their dispersion magnitude and shape anisotropy.
  • Bin-wise Selection: It performs proportional selection within semantic bins to ensure broad semantic coverage.
  • Improved Instruction Tuning: The method is reported to improve instruction-tuning performance on reasoning, knowledge, and coding benchmarks.
  • Practical Pipeline: The repository provides a complete pipeline for multi-sample answer generation, instruction embedding and clustering, ADG scoring and subset selection, model training, and benchmark evaluation.
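The bin-wise selection step listed above can be sketched as follows. This is a hedged illustration, not the repository's code: the function name `select_by_bin` is hypothetical, bin labels are assumed to come from some prior clustering of instruction embeddings, quotas are assigned with a largest-remainder rule, and a higher score is assumed to mean a more desirable instruction (the page does not state the direction).

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def select_by_bin(
    scores: Sequence[float], bins: Sequence[Hashable], budget: int
) -> List[int]:
    """Pick `budget` instruction indices, proportionally across semantic bins.

    Each bin gets a share of the budget proportional to its size, and the
    highest-scoring instructions inside each bin fill that share.
    ASSUMPTION: higher score = preferred; flip `reverse` if the opposite holds.
    """
    members: Dict[Hashable, List[int]] = defaultdict(list)
    for idx, label in enumerate(bins):
        members[label].append(idx)

    total = len(scores)
    budget = min(budget, total)

    # Largest-remainder allocation so the quotas sum exactly to the budget.
    exact = {b: budget * len(ix) / total for b, ix in members.items()}
    quota = {b: int(share) for b, share in exact.items()}
    leftover = budget - sum(quota.values())
    by_remainder = sorted(exact, key=lambda b: exact[b] - quota[b], reverse=True)
    for b in by_remainder[:leftover]:
        quota[b] += 1

    # Within each bin, keep the top-scoring instructions up to the quota.
    selected: List[int] = []
    for b, ix in members.items():
        ranked = sorted(ix, key=lambda i: scores[i], reverse=True)
        selected.extend(ranked[: quota[b]])
    return sorted(selected)
```

Allocating per bin before ranking is what keeps a few high-scoring clusters from absorbing the entire budget, which is the coverage property the bullet describes.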

Good for

  • Researchers and developers interested in advanced instruction data selection techniques.
  • Teams improving LLM performance on reasoning, knowledge, and coding tasks with limited data budgets.
  • Practitioners who want to understand and implement a novel approach to data curation for instruction tuning.