rinna/vicuna-13b-delta-finetuned-langchain-MRKL
Text Generation · Concurrency cost: 1 · Model size: 13B · Quantization: FP8 · Context length: 4K · Published: May 25, 2023 · License: CC BY-SA 4.0 · Architecture: Transformer

rinna/vicuna-13b-delta-finetuned-langchain-MRKL is a 13-billion-parameter Vicuna-based model fine-tuned by Qu Peng to strictly adhere to the LangChain MRKL (zero-shot-react-description) format. Distributed as a delta that must be applied to the original LLaMA weights, it enables efficient tool use and agentic reasoning by generating precise, action-oriented outputs without redundant tokens. Its primary use case is integration with LangChain agents for tasks that require external tools such as search and calculation.


Model Overview

rinna/vicuna-13b-delta-finetuned-langchain-MRKL is a 13-billion-parameter model developed by Qu Peng and built on Vicuna-13B. It is distributed as a delta model: the published weights must be applied on top of the original LLaMA weights before the model is functional. It was fine-tuned on a small dataset of 15 examples in the LangChain MRKL format, with the aim of improving its ability to invoke tools and perform agentic reasoning.
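The "delta" distribution means the published checkpoint stores element-wise differences from the LLaMA-13B base, so recovering usable weights is a per-tensor addition of delta to base (in practice done over PyTorch state dicts, e.g. with FastChat's weight-merging tooling). A minimal sketch of that idea using plain Python lists as stand-in tensors; `apply_delta` here is an illustrative helper, not the actual release script:

```python
def apply_delta(base_weights, delta_weights):
    """Recover fine-tuned weights by adding each delta to its base tensor."""
    if base_weights.keys() != delta_weights.keys():
        raise ValueError("base and delta checkpoints must share parameter names")
    return {
        name: [b + d for b, d in zip(base_weights[name], delta_weights[name])]
        for name in base_weights
    }

# Toy checkpoints standing in for the LLaMA base and the published delta.
base = {"layer0.weight": [0.5, -1.0], "layer0.bias": [0.0, 0.25]}
delta = {"layer0.weight": [0.25, 0.5], "layer0.bias": [0.0, 0.125]}
merged = apply_delta(base, delta)
print(merged["layer0.weight"])  # [0.75, -0.5]
```

Releasing only the delta keeps the distribution compliant with LLaMA's license terms, since the base weights are never redistributed directly.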

Key Capabilities

  • LangChain MRKL Integration: Strictly supports the zero-shot-react-description agent type within LangChain, enabling structured thought, action, and observation cycles.
  • Efficient Tool Use: Designed to generate precise outputs for tool invocation (e.g., Search, Calculator) without producing unnecessary tokens, leading to faster execution.
  • Agentic Reasoning: Demonstrates Vicuna-13B's capacity for structured thought and action, as shown in the provided examples for tasks such as mathematical calculations that require an external search.
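The MRKL format the model is tuned to emit alternates Thought / Action / Action Input / Observation lines, and an agent runtime parses the Action and Action Input fields to decide which tool to call with what argument. A minimal sketch of such a parser, assuming the standard zero-shot-react field names; `parse_action` is a hypothetical helper for illustration, not LangChain's actual output parser:

```python
import re

def parse_action(llm_output: str):
    """Extract the tool name and tool input from a MRKL-style completion."""
    match = re.search(
        r"Action:\s*(?P<tool>.+?)\s*\n+Action Input:\s*(?P<tool_input>.+)",
        llm_output,
    )
    if match is None:
        raise ValueError(f"Could not parse MRKL action from: {llm_output!r}")
    return match.group("tool").strip(), match.group("tool_input").strip()

completion = (
    "Thought: I need to find the population of Tokyo.\n"
    "Action: Search\n"
    "Action Input: population of Tokyo"
)
tool, tool_input = parse_action(completion)
print(tool, "->", tool_input)  # Search -> population of Tokyo
```

Because the fine-tuning teaches the model to stop after `Action Input:` rather than hallucinate an `Observation:` line, a strict parser like this can hand control back to the tool immediately, which is what makes the format token-efficient.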

Good For

  • Developers building applications that leverage LangChain agents for tool-augmented language model interactions.
  • Scenarios where strict adherence to a specific output format for agent actions is critical.
  • Use cases requiring efficient and fast execution of agentic workflows by minimizing redundant token generation.