NexaAI/Octopus-v2
NexaAI/Octopus-v2 is a 2 billion parameter language model developed by Nexa AI and built for efficient on-device function calling. It uses a functional token strategy for both training and inference, which gives it high accuracy and faster inference than RAG-based methods and GPT-4. The model generates individual, nested, and parallel function calls, making it well suited to Android API orchestration and edge computing applications.
Octopus-v2: On-Device Function Calling Language Model
Nexa AI's Octopus-v2 is a 2 billion parameter language model designed for efficient on-device function calling, particularly for Android APIs. It introduces a functional token strategy that is applied in both training and inference, allowing it to reach function call accuracy comparable to much larger models such as GPT-4 while running considerably faster.
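To illustrate the functional token idea, here is a minimal sketch of how an application might turn a model completion into an actual call. It assumes each callable function is bound to one dedicated token, so selecting a function amounts to emitting a single token. The token names (`<fn_0>`, `<fn_1>`, `<fn_end>`), the completion format, and the two device functions are hypothetical placeholders for illustration, not the model's actual vocabulary or API.

```python
import ast
import re

# Hypothetical functional tokens, each bound to one device function.
# These names and functions are illustrative assumptions only.
FUNCTION_TABLE = {
    "<fn_0>": lambda camera: f"photo taken with {camera} camera",
    "<fn_1>": lambda level: f"volume set to {level}",
}
END_TOKEN = "<fn_end>"  # hypothetical end-of-call marker

# Matches "<fn_N>(arguments)<fn_end>" anywhere in the completion.
CALL_PATTERN = re.compile(
    r"(<fn_\d+>)\((.*?)\)" + re.escape(END_TOKEN), re.DOTALL
)


def parse_call(completion):
    """Extract the functional token and its literal arguments."""
    match = CALL_PATTERN.search(completion)
    if match is None:
        raise ValueError("no function call found in completion")
    token, raw_args = match.groups()
    if token not in FUNCTION_TABLE:
        raise KeyError(f"unknown functional token: {token}")
    # Parse the arguments as Python literals; the trailing comma
    # forces a tuple even when there is a single argument.
    args = ast.literal_eval(f"({raw_args},)") if raw_args.strip() else ()
    return token, args


def dispatch(completion):
    """Look up the function bound to the token and call it."""
    token, args = parse_call(completion)
    return FUNCTION_TABLE[token](*args)


print(dispatch("<fn_0>('front')<fn_end>"))
```

Because the lookup is a dictionary access keyed by one token, the application does not need to retrieve or rank candidate function descriptions at inference time, which is the step a RAG-based pipeline would add.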
Key Capabilities:
- Exceptional Inference Speed: Runs 36X faster than a "Llama7B + RAG solution" on an A100 GPU and is 168% faster than GPT-4-turbo, a gain attributed to its functional token design.
- High Function Call Accuracy: Achieves 98-100% function call accuracy, surpassing the "Llama7B + RAG solution" by 31% and matching GPT-4 and RAG + GPT-3.5.
- Versatile Function Calling: Capable of generating individual, nested, and parallel function calls across complex scenarios.
- On-Device Optimization: Engineered for seamless operation on Android devices, supporting applications from system management to multi-device orchestration.
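The three call shapes named above (individual, nested, and parallel) can be sketched as plain data structures. This is an illustrative sketch under stated assumptions: the `Call` type, the executor, and the device functions (`get_city`, `get_weather`, `set_volume`) are hypothetical and do not reflect the model's real output format or any Android API.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """One function call; an argument may itself be a Call (nesting)."""
    name: str
    args: tuple = ()


# Hypothetical device functions, for illustration only.
FUNCTIONS = {
    "get_city": lambda: "Berlin",
    "get_weather": lambda city: f"sunny in {city}",
    "set_volume": lambda level: f"volume={level}",
}


def run(call):
    """Execute a call, resolving nested calls in its arguments first."""
    resolved = [run(a) if isinstance(a, Call) else a for a in call.args]
    return FUNCTIONS[call.name](*resolved)


def run_parallel(calls):
    """Execute several independent calls produced for one query.

    They run one after another here; a real runtime could run them
    concurrently because none depends on another's result.
    """
    return [run(c) for c in calls]


individual = Call("set_volume", (3,))
nested = Call("get_weather", (Call("get_city"),))
parallel = [Call("set_volume", (5,)), Call("get_city")]

print(run(individual))
print(run(nested))
print(run_parallel(parallel))
```

The distinction matters for an orchestrator: a nested call fixes an evaluation order (inner result feeds the outer call), while parallel calls carry no ordering constraint.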
Good For:
- Developers building AI agents for edge computing and Android applications requiring fast and accurate function calling.
- Use cases where efficient execution of Android APIs is critical, such as smart device control or specialized mobile applications.
- Scenarios demanding high function call accuracy with minimal latency on resource-constrained devices.