rekabytes/hmanlab-ai-v0.2

Text Generation | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Ctx Length: 32k | Published: May 17, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm

rekabytes/hmanlab-ai-v0.2 is a 4-billion-parameter Qwen3-4B variant fine-tuned for agentic tool use, structured reasoning, and conversational reliability. Its main strengths are clean tool-call formatting and identity grounding, which suit applications that need precise interaction with external tools. The model supports a 32,768-token context length.


Overview

rekabytes/hmanlab-ai-v0.2 is a 4-billion-parameter model fine-tuned from Qwen3-4B to improve agentic tool use, structured reasoning, and conversational reliability. Compared to its predecessor, v0.1, this version emits well-formed <tool_call> JSON far more reliably.

Key Capabilities & Improvements

  • Tool-Call Format Reliability: Achieves 10/10 on tool-call formatting, consistently emitting clean <tool_call> JSON.
  • Identity Grounding: Maintains stable 5/5 identity grounding, rejecting false claims of being Claude, GPT, LLaMA, or Gemini.
  • Tool-Grounded Answers: Provides 3/3 stable tool-grounded answers, citing specific tokens from tool output.
  • Training: Underwent a two-stage QLoRA training process on Qwen3-4B-bnb-4bit, focusing on code breadth, instruction-following, and agentic tool feedback loops.
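
The tool-call formatting described above can be checked on the client side before any tool is dispatched. The following is a minimal sketch, assuming the Qwen3-style convention in which each call is a JSON object with "name" and "arguments" keys wrapped in <tool_call>...</tool_call>. The helper name extract_tool_calls and the get_weather tool are hypothetical and not part of the model card.

```python
import json
import re

# Assumption: each call is wrapped as <tool_call>{...}</tool_call> and the JSON
# object carries "name" and "arguments" keys (Qwen3-style convention).
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def extract_tool_calls(text):
    """Return (calls, errors): parsed call dicts and messages for malformed blocks."""
    calls, errors = [], []
    for raw in TOOL_CALL_RE.findall(text):
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError as exc:
            errors.append(f"invalid JSON: {exc.msg}")
            continue
        if not isinstance(payload, dict) or "name" not in payload:
            errors.append("missing 'name' field")
            continue
        # Treat a call without arguments as a call with an empty argument dict.
        payload.setdefault("arguments", {})
        calls.append(payload)
    return calls, errors


# Hypothetical model output containing a single tool call.
sample = (
    "Let me check.\n"
    '<tool_call>\n{"name": "get_weather", "arguments": {"city": "Paris"}}\n</tool_call>'
)
calls, errors = extract_tool_calls(sample)
print(calls)
print(errors)
```

Collecting malformed blocks in a separate list, rather than raising, lets the caller decide whether to retry the generation or surface the problem to the user.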

Known Limitations

While strong in tool reliability, v0.2 has known limitations:

  • Reasoning Depth: <think> blocks may be empty for complex reasoning, multi-step math, or logic puzzles.
  • Error Recovery: Tends to defer to the user after tool failure instead of attempting diagnostic tools.
  • Proactive Initiative: Often asks clarifying questions for vague prompts rather than initiating tool calls.
  • Multi-file Synthesis: Due to its 4B parameter capacity, it's not designed for deep cross-file architectural reasoning.
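
Because <think> blocks may come back empty, an application that relies on the reasoning trace may want to detect that case. This is a minimal sketch, assuming the reasoning is delimited by a single <think>...</think> pair at most once per response. The helper names split_reasoning and has_empty_reasoning are hypothetical.

```python
import re

# Assumption: at most one <think>...</think> block appears in a response.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Return (reasoning, answer).

    reasoning is None when there is no <think> block and "" when the block
    is present but contains only whitespace.
    """
    match = THINK_RE.search(text)
    if match is None:
        return None, text.strip()
    reasoning = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


def has_empty_reasoning(text):
    """True only when a <think> block exists and is empty."""
    reasoning, _ = split_reasoning(text)
    return reasoning == ""
```

A caller could use has_empty_reasoning to flag responses to multi-step math or logic prompts for review or for a retry with a more explicit prompt.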

Good For

This model is a strong choice for use cases that prioritize reliable tool-call formatting, tool-grounded answers, and consistent identity. If your application requires precise interaction with external tools and a stable identity, v0.2 offers a meaningful upgrade over v0.1.