driaforall/Dria-Agent-a-3B

Text Generation | Concurrency Cost: 1 | Model Size: 3.1B | Quant: BF16 | Ctx Length: 32k | Published: Jan 9, 2025 | License: qwen-research | Architecture: Transformer

Dria-Agent-α-3B is a 3.1 billion parameter large language model developed by driaforall, fine-tuned from Qwen2.5-Coder-3B-Instruct. This model specializes in agentic applications, particularly excelling at Pythonic function calling for complex, multi-step interactions with tools. It supports one-shot parallel function calls and generates free-form reasoning traces, making it suitable for advanced automation and task execution.

Dria-Agent-α-3B: Pythonic Function Calling for Agentic LLMs

Dria-Agent-α-3B is a 3.1 billion parameter model from driaforall, fine-tuned from Qwen2.5-Coder-3B-Instruct. It is designed for agentic applications and centers on "Pythonic function calling": instead of emitting one JSON object per tool call, the model interacts with tools by writing blocks of Python code. Because the output is code, a single response can chain calls, branch on results, and carry intermediate values, which JSON-based function calling cannot express in one turn.
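The difference between the two calling styles can be sketched as follows. This is a minimal illustration, not the model's actual output format: the tools `get_weather` and `send_message`, the JSON payload shape, and the generated snippet are all hypothetical and were written by hand for this example.

```python
import json

# Hypothetical tools, invented for this illustration.
def get_weather(city: str) -> dict:
    """Return a canned forecast for a city."""
    forecasts = {
        "Paris": {"temp_c": 8, "rain": True},
        "Cairo": {"temp_c": 27, "rain": False},
    }
    return forecasts[city]

def send_message(to: str, text: str) -> str:
    """Pretend to send a message and return a confirmation string."""
    return f"sent to {to}: {text}"

TOOLS = {"get_weather": get_weather, "send_message": send_message}

# JSON-style call: one function, literal arguments, no logic between calls.
json_call = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
call = json.loads(json_call)
json_result = TOOLS[call["name"]](**call["arguments"])

# Pythonic call: the model emits code, so it can chain calls and
# branch on an intermediate result within a single turn.
pythonic_call = '''
forecast = get_weather("Paris")
if forecast["rain"]:
    status = send_message("alice", "Bring an umbrella")
else:
    status = send_message("alice", "No umbrella needed")
'''
namespace = dict(TOOLS)  # expose only the tools to the generated code
exec(pythonic_call, namespace)
pythonic_result = namespace["status"]

print(json_result)
print(pythonic_result)
```

With the JSON style, the follow-up `send_message` call would need a second turn after the weather result comes back. Note that `exec` on model output is unsafe outside a sandbox; it is used here only to keep the sketch short.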

Key Capabilities & Differentiators

  • Pythonic Function Calling: Utilizes Python code blocks for tool interaction, enabling more flexible and powerful agentic behavior.
  • One-shot Parallel Multiple Function Calls: Can issue several function calls, executed as synchronous steps, within a single chat turn, streamlining complex workflows that would typically require several conversational turns.
  • Free-form Reasoning and Actions: Generates natural language reasoning traces alongside actions embedded in python blocks, mitigating performance loss from rigid output formats.
  • On-the-fly Complex Solution Generation: Capable of implementing custom logic, conditionals, and synchronous pipelines within its generated Python code, allowing for sophisticated problem-solving.
  • Agent-Focused Design: Represents the first installment in a series of LLMs specifically optimized for agentic use cases.
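A host application has to separate the free-form reasoning from the python blocks before it can run anything. The sketch below shows one way to do that, under stated assumptions: the response text is made up for this example (it is not real model output), the tools `check_stock` and `place_order` are hypothetical, and code is assumed to arrive in fenced blocks tagged `python`.

```python
import re

# Built programmatically so this example can itself sit inside a fenced block.
FENCE = "`" * 3
BLOCK_RE = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)

# Hypothetical tools, invented for this illustration.
def check_stock(item: str) -> int:
    """Return how many units of an item are available."""
    return {"widget": 3, "gadget": 0}.get(item, 0)

def place_order(item: str, quantity: int) -> str:
    """Pretend to place an order and return a confirmation string."""
    return f"ordered {quantity} x {item}"

# Hand-written stand-in for a model response: reasoning plus one code block
# that makes several tool calls with a conditional in between.
response = (
    "I will check both items and order only what is in stock.\n"
    + FENCE + "python\n"
    "orders = []\n"
    "for item in ['widget', 'gadget']:\n"
    "    available = check_stock(item)\n"
    "    if available > 0:\n"
    "        orders.append(place_order(item, min(available, 2)))\n"
    + FENCE + "\n"
    "Both checks are done in a single turn."
)

def split_response(text: str) -> tuple[str, list[str]]:
    """Separate natural-language reasoning from the python code blocks."""
    code_blocks = BLOCK_RE.findall(text)
    reasoning = BLOCK_RE.sub("", text).strip()
    return reasoning, code_blocks

def run_blocks(code_blocks: list[str], tools: dict) -> dict:
    """Execute each block in a namespace that exposes only the given tools."""
    namespace = dict(tools)
    for block in code_blocks:
        exec(block, namespace)  # unsafe outside a sandbox; illustration only
    return namespace

reasoning, blocks = split_response(response)
state = run_blocks(blocks, {"check_stock": check_stock, "place_order": place_order})
print(state["orders"])
```

Keeping the reasoning and the code separate lets the host log or display the explanation while executing only the code, which matches the "free-form reasoning and actions" split described above.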

Performance Highlights

On the Berkeley Function Calling Leaderboard (BFCL), Dria-Agent-α-3B scores 90.00% on "Non-Live Parallel Exec" and 100.00% on "Relevance Detection". On the Dria-Pythonic-Agent-Benchmark (DPAB), it scores 72, compared with 26 for its base model. Its MMLU-Pro score is 29.8; qualitative analysis suggests that the standard MMLU-Pro evaluation scripts may underestimate the model in STEM fields because of its Pythonic function calling behavior.

Ideal Use Cases

This model is particularly well-suited for applications requiring:

  • Complex Automation: Automating multi-step tasks that involve interacting with various tools and APIs.
  • Intelligent Agents: Developing agents that can reason, plan, and execute actions through code.
  • Dynamic Tool Use: Scenarios where the agent needs to generate custom logic or conditional flows for tool interaction.
  • Code Generation for Tool Orchestration: Generating Python code to orchestrate tool calls efficiently.
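For any of these use cases, the model needs to know which functions it may call. One common approach is to render the tool signatures and docstrings into the prompt. The sketch below does this with the standard `inspect` module; the prompt wording and layout are assumptions made for illustration and are not the model's official system prompt, and `get_balance` and `transfer` are hypothetical tools.

```python
import inspect

# Hypothetical tools, invented for this illustration.
def get_balance(account_id: str) -> float:
    """Return the current balance of an account."""
    return 0.0

def transfer(source: str, target: str, amount: float) -> bool:
    """Move funds between two accounts; True on success."""
    return True

def render_tool_stub(func) -> str:
    """Render a function as a signature-plus-docstring stub for the prompt."""
    signature = inspect.signature(func)
    doc = inspect.getdoc(func) or ""
    return f'def {func.__name__}{signature}:\n    """{doc}"""\n    pass'

def build_system_prompt(tools) -> str:
    """Join the stubs under a short instruction (wording is an assumption)."""
    stubs = "\n\n".join(render_tool_stub(tool) for tool in tools)
    return (
        "You can call the following Python functions. "
        "Reply with your reasoning and a python code block.\n\n" + stubs
    )

prompt = build_system_prompt([get_balance, transfer])
print(prompt)
```

Generating the stubs from live function objects keeps the prompt in sync with the tools that the host will actually expose when it executes the model's code.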