Name: fesalfayed/gpt-oss-20b-hermes_agent-tool-finetune_4bit API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: fesalfayed

Model Overview

This model, fesalfayed/gpt-oss-20b-hermes_agent-tool-finetune_4bit, is a 4-bit MXFP4 quantized version of OpenAI's gpt-oss-20b (a 21B-parameter Mixture-of-Experts model). Developed by Fesal Fayed, it is specifically fine-tuned for robust tool-use within the Hermes-Agent local agent framework. The finetune preserves the Harmony chat template and reasoning-effort knob, while significantly enhancing agentic capabilities.

Key Capabilities & Enhancements

Function-calling adherence: Improved reliability in generating correct JSON for tool calls without extraneous commentary.
Long agent loops: Excels in extended multi-turn interactions (10+ turns of tool → observe → plan).
System-prompt fidelity: Better adherence to role boundaries and refusal/allow-list rules defined in the system prompt.
Resource efficiency: The 4-bit MXFP4 quantization allows the model to fit within approximately 14-16 GB of VRAM, making it runnable on GPUs like a Colab T4 while retaining its full 8k context length.

Training Details

The model was trained using LoRA SFT (rank 64, alpha 16) on a single H100 GPU, utilizing ~42k tool-use traces from Hermes-Agent sessions. The training focused on successful tool calls and clean JSON, with an 8192 token length and assistant-only loss masking.

Limitations

Reasoning & Code: Math and code-generation capabilities are inherited from the base model and are not specifically optimized by this finetune.
Tool Over-calling: May over-call tools with vague instructions; users can mitigate this by adding specific instructions to the system prompt.
Language: English-only, as other languages were not included in the training data.
Safety: Safety-tuning is limited to what the base gpt-oss-20b provides.

Recommended Usage

This model is recommended for use with Unsloth or Transformers + bitsandbytes for optimal performance. It is particularly well-suited for developers building local agent applications that require reliable tool interaction and complex multi-step planning.

Overview

Model Overview

Key Capabilities & Enhancements

Training Details

Limitations

Recommended Usage

Full Model Card (README)