ermiaazarkhalili/VibeThinker-3B-Function-Calling-xLAM-Unsloth

  • Task: Text generation
  • Concurrency Cost: 1
  • Model Size: 3.1B
  • Quant: BF16
  • Ctx Length: 32k
  • Tool Calling: Supported
  • Published: Jun 21, 2026
  • License: MIT
  • Architecture: Transformer
  • Open Weights

The ermiaazarkhalili/VibeThinker-3B-Function-Calling-xLAM-Unsloth model is a 3.1 billion parameter language model fine-tuned from WeiboAI's VibeThinker-3B. It is optimized for function calling using the Salesforce/xlam-function-calling-60k dataset, which makes it suitable for connecting applications to external tools and APIs. Fine-tuning was done with Unsloth, which speeds up training and reduces VRAM consumption.


Overview

This model, developed by ermiaazarkhalili, is a fine-tuned version of the 3.1 billion parameter VibeThinker-3B base model. It specializes in function calling, which it learned through supervised fine-tuning (SFT) with 4-bit QLoRA on the Salesforce/xlam-function-calling-60k dataset. The dataset contains 60,000 examples, each pairing a query with tool definitions and a structured answer.
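To make the training data shape concrete, here is a minimal sketch of turning one record into a prompt/completion pair for SFT. It assumes each record has `query`, `tools`, and `answers` fields, with the latter two stored as JSON-encoded strings. The prompt template and the `format_example` helper are hypothetical; the source does not state the chat template actually used for training.

```python
import json


def format_example(record: dict) -> dict:
    """Turn one dataset record into a prompt/completion pair for SFT."""
    # Assumption: `tools` and `answers` arrive as JSON-encoded strings.
    tools = json.loads(record["tools"])
    answers = json.loads(record["answers"])

    # Hypothetical template, for illustration only.
    prompt = (
        "Available tools:\n"
        + json.dumps(tools, indent=2)
        + "\n\nUser query:\n"
        + record["query"]
        + "\n\nTool calls:\n"
    )
    # The target is the structured answer, serialized back to JSON.
    completion = json.dumps(answers)
    return {"prompt": prompt, "completion": completion}


# A made-up record in the assumed shape.
record = {
    "query": "What is the weather in Paris?",
    "tools": json.dumps(
        [
            {
                "name": "get_weather",
                "description": "Look up current weather for a city.",
                "parameters": {"city": {"type": "str", "description": "City name"}},
            }
        ]
    ),
    "answers": json.dumps(
        [{"name": "get_weather", "arguments": {"city": "Paris"}}]
    ),
}

pair = format_example(record)
print(pair["prompt"] + pair["completion"])
```

Keeping the completion as plain JSON means the same parser can be reused at inference time to read the model's output.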

Key Capabilities

  • Function Calling: Interprets natural language requests and generates structured function calls for external tools and APIs.
  • Efficient Training: Fine-tuned using Unsloth, which is credited with roughly 2x faster training and 60% less VRAM usage than a standard fine-tuning setup.
  • Small Footprint: At 3.1 billion parameters, it offers a capable solution for function calling in resource-constrained environments.
  • Quantized Versions: Available in GGUF formats (Q4_K_M, Q5_K_M, Q8_0) for CPU and edge device inference, supporting platforms like Ollama and llama.cpp.
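As a rough guide to the footprint of the different formats, the sketch below estimates weight storage from parameter count and bits per weight. It is a back-of-the-envelope calculation, not a measurement of the published files: it ignores the KV cache, activations, and tensors kept at higher precision. The 8.5 bits per weight for Q8_0 follows from its block layout (32 int8 values plus one 16-bit scale); the K-quant formats mix precisions across tensors, so no figure is assumed for them here. The `weight_size_gb` helper is hypothetical.

```python
def weight_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 3.1e9  # parameter count stated for this model

bf16_gb = weight_size_gb(N_PARAMS, 16)   # BF16: 16 bits per weight
q8_0_gb = weight_size_gb(N_PARAMS, 8.5)  # Q8_0: 34 bytes per 32 weights

print(f"BF16: {bf16_gb:.2f} GB")
print(f"Q8_0: {q8_0_gb:.2f} GB")
```

Under these assumptions the BF16 weights come to about 6.2 GB and Q8_0 to about 3.3 GB; the Q4_K_M and Q5_K_M files will be smaller still.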

Good For

  • Tool Integration: Developing applications that require an LLM to interact with external APIs or services based on user prompts.
  • Resource-Constrained Deployment: Adding function-calling capabilities where GPU memory or compute is limited.
  • Prototyping: Quickly building and testing function-calling agents, helped by the model's small size and inexpensive fine-tuning.
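The tool-integration use case above can be sketched as a small dispatcher that parses the model's output and calls matching Python functions. This assumes the model emits a JSON list of objects with `name` and `arguments` keys, mirroring the structured answers in the training data; the actual output format depends on the prompt and chat template used. The tools, `REGISTRY`, and `run_tool_calls` are hypothetical names for illustration.

```python
import json
from typing import Any, Callable, Dict, List


def get_weather(city: str) -> str:
    # Placeholder tool: a real one would call a weather API.
    return f"Weather lookup requested for {city}"


def add(a: float, b: float) -> float:
    return a + b


# Map tool names the model may emit to local callables.
REGISTRY: Dict[str, Callable[..., Any]] = {
    "get_weather": get_weather,
    "add": add,
}


def run_tool_calls(model_output: str) -> List[Any]:
    """Parse a JSON list of calls and execute each one in order."""
    # Assumption: output looks like [{"name": ..., "arguments": {...}}, ...]
    calls = json.loads(model_output)
    results = []
    for call in calls:
        func = REGISTRY[call["name"]]  # KeyError if the tool is unknown
        results.append(func(**call["arguments"]))
    return results


# A hand-written string standing in for real model output.
output = '[{"name": "add", "arguments": {"a": 2, "b": 3}}]'
results = run_tool_calls(output)
print(results)
```

In an application, the results would typically be fed back to the model or returned to the user.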

Limitations

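  • Context Length: Fine-tuned with a 2,048-token context window, so function-calling behavior on longer inputs is less certain even though the listed context length is 32k.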
  • Language: Primarily trained on English data.
  • Safety: Not extensively safety-tuned; requires external guardrails for sensitive applications.
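One simple form of external guardrail is to validate every generated call against an allowlist before executing it. The sketch below is one possible approach, not a complete safety layer. It assumes calls are dictionaries with `name` and `arguments` keys; the `ALLOWED_TOOLS` table and `validate_call` helper are hypothetical.

```python
from typing import List

# Hypothetical allowlist: which tools may run and which arguments they accept.
ALLOWED_TOOLS = {
    "get_weather": {"required": {"city"}, "allowed": {"city", "units"}},
}


def validate_call(call: dict) -> List[str]:
    """Return a list of problems; an empty list means the call may run."""
    name = call.get("name")
    if not isinstance(name, str) or name not in ALLOWED_TOOLS:
        return [f"tool not allowed: {name!r}"]

    args = call.get("arguments")
    if not isinstance(args, dict):
        return ["arguments must be an object"]

    spec = ALLOWED_TOOLS[name]
    problems = []
    # Report missing required arguments and unexpected extras separately.
    for arg in sorted(spec["required"] - set(args)):
        problems.append(f"missing argument: {arg}")
    for arg in sorted(set(args) - spec["allowed"]):
        problems.append(f"unexpected argument: {arg}")
    return problems


ok = validate_call({"name": "get_weather", "arguments": {"city": "Paris"}})
blocked = validate_call({"name": "delete_all", "arguments": {}})
print(ok, blocked)
```

Rejecting unknown tools and unexpected arguments by default keeps a malformed or adversarial generation from reaching real APIs.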