carlosmm26/Atanor-4B
Atanor-4B is a 4.5 billion parameter model, fine-tuned from Qwen3.5-4B by carlosmm26, specifically optimized for agentic tool-use within the Hermes Agent framework. This model demonstrates improved tool selection and task success rates compared to its base model, making it suitable for local, resource-constrained agentic applications. It was trained entirely on a single RTX 3090, focusing on reasoning repair and Hermes tool-use traces.
Loading preview...
Atanor-4B: Agentic Tool-Use Model
Atanor-4B is a 4.5 billion parameter model, fine-tuned from Qwen3.5-4B by carlosmm26, with a specific focus on enhancing agentic tool-use capabilities within the Hermes Agent ecosystem. This model was developed to explore the potential of smaller models for agentic tasks, with its entire training process conducted locally on a single RTX 3090 GPU.
Key Capabilities & Performance
Evaluated on a 60-task Hermes-native agent benchmark, Atanor-4B shows notable improvements over its base model:
- Agent Score: Increased from 0.81 to 0.84.
- Tool Selection: The ability to pick the correct tool for a task doubled from 30% to 60%.
- Task Success: Improved from 67% to 73%.
Training Methodology
The fine-tuning process involved two LoRA stages (BF16) on an RTX 3090:
- Stage A (Reasoning Repair): Utilized Bespoke-Stratos and NuminaMath-CoT datasets.
- Stage B (Hermes Tool-Use): Trained on the
kai-os/carnice-glm5-hermes-tracesdataset, focusing on agentic traces with a sequence length of 16384.
Good For
- Local Agentic Applications: Designed for efficient execution in environments like
llama.cppor Hermes Agent, even on consumer-grade GPUs. - Tool-Use Scenarios: Excels in tasks requiring precise tool selection and execution.
- Experimentation: Ideal for developers interested in exploring and deploying smaller, specialized agent models.