ermiaazarkhalili/Llama-3-8B-Instruct_Function_Calling_xLAM

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8k · Published: Jul 30, 2025 · License: llama3 · Architecture: Transformer

The Llama-3-8B-Function-Calling-xLAM model by ermiaazarkhalili is an 8-billion-parameter instruction-tuned causal language model, fine-tuned from Meta-Llama-3-8B-Instruct. It is optimized specifically for function-calling tasks via Supervised Fine-Tuning (SFT) with LoRA adapters on the Salesforce/xlam-function-calling-60k dataset. The model is designed for efficient inference and is available in multiple formats, including GGUF quantizations, making it suitable for research and for prototyping conversational AI with function-calling capabilities.


Overview

Llama-3-8B-Function-Calling-xLAM is an 8 billion parameter language model developed by ermiaazarkhalili, fine-tuned from Meta's Llama-3-8B-Instruct. Its primary focus is on function calling, achieved through Supervised Fine-Tuning (SFT) using LoRA (Low-Rank Adaptation) with 4-bit quantization. The model was trained on the Salesforce/xlam-function-calling-60k dataset, specifically designed for function calling tasks.
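As a rough sketch of how a function-calling model like this is typically driven: the application embeds the available tool signatures in the prompt and parses the model's reply into call records. The prompt template and the JSON-list reply format below are assumptions modeled on the xLAM dataset's call format, not documented behavior of this checkpoint.

```python
import json

def build_prompt(query: str, tools: list[dict]) -> str:
    """Embed the available tool signatures in a system-style prompt.
    This template is illustrative, not the checkpoint's documented format."""
    tool_block = json.dumps(tools, indent=2)
    return (
        "You have access to the following functions:\n"
        f"{tool_block}\n\n"
        'Respond with a JSON list of calls, each {"name": ..., "arguments": ...}.\n\n'
        f"User: {query}"
    )

def parse_calls(model_output: str) -> list[dict]:
    """Parse the model's JSON answer into (name, arguments) call records."""
    return [
        {"name": c["name"], "arguments": c.get("arguments", {})}
        for c in json.loads(model_output)
    ]

# Hypothetical tool definition for illustration only.
tools = [{
    "name": "get_weather",
    "description": "Get current weather for a city",
    "parameters": {"city": {"type": "string"}},
}]

prompt = build_prompt("What's the weather in Paris?", tools)
# A well-behaved reply from the fine-tuned model might look like:
reply = '[{"name": "get_weather", "arguments": {"city": "Paris"}}]'
calls = parse_calls(reply)
print(calls)  # [{'name': 'get_weather', 'arguments': {'city': 'Paris'}}]
```

The string returned by `build_prompt` would be passed to the model's `generate` call; only the parsing step depends on the model actually emitting well-formed JSON.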

Key Capabilities

  • Function Calling Optimization: Specifically fine-tuned to understand and generate function calls based on user prompts.
  • Efficient Training: Utilizes LoRA with 4-bit NF4 quantization for efficient adaptation of the base model.
  • Inference Flexibility: Available in multiple formats, including GGUF quantizations, for deployment on various hardware, including CPU.
  • Base Model Strength: Benefits from the robust architecture and pre-training of the Meta-Llama-3-8B-Instruct model.
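The SFT recipe described above, LoRA adapters over a 4-bit NF4-quantized base, can be sketched with the Hugging Face `peft` and `bitsandbytes` integrations. The rank, alpha, dropout, and target modules below are illustrative QLoRA-style defaults, not the author's actual hyperparameters.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Load the base model with 4-bit NF4 quantization (QLoRA-style setup).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach LoRA adapters; only these small low-rank matrices are trained,
# while the quantized base weights stay frozen.
lora_config = LoraConfig(
    r=16,                      # illustrative rank, not the author's setting
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```

After SFT on the xLAM dataset, the adapters can be merged into the base weights or shipped separately as a small adapter checkpoint.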

Good For

  • Research: Ideal for studying language model fine-tuning techniques, particularly for function calling.
  • Prototyping Conversational AI: Suitable for developing and testing conversational agents that require tool use or function invocation.
  • Educational Purposes: Can be used for learning about SFT, LoRA, and function calling implementations in LLMs.
  • Personal Projects: Applicable for various personal AI projects where function calling is a core requirement.
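For a prototype agent, the calls the model emits are typically dispatched against a registry of Python callables. A minimal, self-contained sketch (the `tool` decorator and `add` function are hypothetical, for illustration only):

```python
import json
from typing import Any, Callable

# Registry of tools the agent is allowed to invoke.
TOOLS: dict[str, Callable[..., Any]] = {}

def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a Python function as an invocable tool."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def add(a: float, b: float) -> float:
    return a + b

def execute_calls(model_output: str) -> list[Any]:
    """Run each JSON call emitted by the model and collect the results."""
    results = []
    for call in json.loads(model_output):
        fn = TOOLS[call["name"]]          # KeyError here means a hallucinated tool
        results.append(fn(**call.get("arguments", {})))
    return results

# Pretend the model emitted this function-calling answer:
results = execute_calls('[{"name": "add", "arguments": {"a": 2, "b": 3}}]')
print(results)  # [5]
```

Looking the name up in an explicit registry, rather than `eval`-ing model output, keeps hallucinated or malicious calls from reaching arbitrary code.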

Limitations

  • Context Length: Fine-tuned with a 2,048-token sequence length, so function-calling reliability may degrade on longer inputs despite the base model's 8k context window.
  • Language: Primarily trained on English data.
  • Safety: Not extensively safety-tuned; requires additional guardrails for production use.
  • Knowledge Cutoff: Inherits the knowledge cutoff of its base model.