jesusvilela/igbundle-qwen2.5-7b-riemannian

Text Generation | Concurrency Cost: 1 | Model Size: 7.6B | Quant: FP8 | Ctx Length: 32k | Published: Jan 10, 2026 | Architecture: Transformer

The jesusvilela/igbundle-qwen2.5-7b-riemannian model is a 7.6 billion parameter Qwen2.5-7B base model fine-tuned with ManifoldGL, a novel parameter-efficient method that enforces Information-Geometric constraints. It models the semantic latent space as a Fiber Bundle over a Hyperbolic Base Manifold, providing an inductive bias for hierarchical concept organization. This approach aims to enhance abstract reasoning and systematic generalization by maintaining hyperbolic geometry and sheaf consistency, demonstrating strong performance on benchmarks like ARC-AGI and GSM8K.


ManifoldGL: Information-Geometric Bundle Adapters for LLMs

This model, jesusvilela/igbundle-qwen2.5-7b-riemannian, is a 7.6 billion parameter Qwen2.5-7B base model fine-tuned with ManifoldGL, a parameter-efficient fine-tuning method. Unlike standard LoRA, ManifoldGL operates in a non-Euclidean latent space: it models semantics as a Fiber Bundle over a Hyperbolic Base Manifold (the Poincaré Ball with constant curvature $\kappa = -1$). This geometry gives the adapter an inductive bias toward hierarchical concept organization, with the goal of improving abstract reasoning.
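
To make the hyperbolic base manifold concrete, here is a minimal sketch of the standard geodesic distance on the Poincaré ball with curvature $\kappa = -1$. It is the textbook formula, not code taken from ManifoldGL, and the function name `poincare_distance` is a hypothetical choice for illustration.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points of the open unit ball (curvature -1).

    Textbook formula: d(u, v) = arcosh(1 + 2|u - v|^2 / ((1 - |u|^2)(1 - |v|^2))).
    Illustrative helper only; not part of the ManifoldGL code base.
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)


origin = (0.0, 0.0)
near = (0.5, 0.0)   # halfway to the boundary in Euclidean terms
far = (0.99, 0.0)   # very close to the boundary

print(poincare_distance(origin, near))
print(poincare_distance(origin, far))
```

The second distance is several times the first even though the Euclidean gap only doubles: distances grow without bound toward the boundary, so the ball has room for the exponentially many leaves of a tree.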

Key Capabilities & Innovations

  • Hyperbolic Inductive Bias: Enforces hyperbolic geometry, whose volume grows exponentially with radius, so hierarchical trees embed efficiently. This is intended to prevent the "Semantic Drift" common in flat Euclidean spaces.
  • Information-Geometric Constraints: Utilizes Differential Geometry and Sheaf Theory to ensure local consistency and maintain geometric integrity of learned representations.
  • Enhanced Reasoning: Preserves general reasoning on ARC-Challenge (0% degradation) and scores 75.51% on GSM8K, indicating robust multi-step reasoning.
  • Parameter-Efficient: Injected as a bottleneck adapter. Each training step carries extra overhead, but training needs 30% fewer steps than the LoRA baseline, which yields a net efficiency gain.
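
The net-efficiency claim in the last bullet can be checked with simple arithmetic. The sketch below assumes total training cost scales as (number of steps) × (cost per step); the 30% step reduction comes from the text, while the per-step overhead values are hypothetical placeholders, since the card does not state the actual overhead.

```python
def relative_training_cost(step_ratio, per_step_overhead):
    """Total cost relative to the baseline: (steps ratio) * (1 + per-step overhead)."""
    return step_ratio * (1.0 + per_step_overhead)


def break_even_overhead(step_ratio):
    """Largest per-step overhead at which total cost still matches the baseline."""
    return 1.0 / step_ratio - 1.0


STEP_RATIO = 0.70  # from the text: 30% fewer steps than the LoRA baseline

# Hypothetical overheads, for illustration only.
for overhead in (0.10, 0.25, 0.40):
    cost = relative_training_cost(STEP_RATIO, overhead)
    print(f"overhead {overhead:.0%}: total cost is {cost:.3f}x the baseline")

print(f"break-even overhead: {break_even_overhead(STEP_RATIO):.1%}")
```

Under this cost model, the adapter comes out ahead as long as each step is less than about 43% more expensive than a LoRA step.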

When to Use This Model

  • Abstract Reasoning Tasks: Ideal for applications requiring systematic generalization and abstract problem-solving, as demonstrated by its ARC-AGI performance.
  • Hierarchical Data Processing: Suitable for use cases where understanding and organizing hierarchical concepts are crucial.
  • Research & Development: Excellent for exploring novel geometric approaches in LLM fine-tuning and understanding the impact of non-Euclidean latent spaces on AI capabilities.