Name: Locutusque/Hercules-4.0-Yi-34B API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: Locutusque

Hercules-4.0-Yi-34B: Specialized Instruction Following and Function Calling

Hercule-4.0-Yi-34B is a 34 billion parameter language model, fine-tuned from the Yi-34B base model by Locutusque. It is specifically designed to enhance performance in complex instruction following, function calling, and engaging in detailed conversations within scientific and technical fields. The model's training utilized the Hercules-v4.0 dataset, which expands upon OpenHermes-2.5 with additional curated data.

Key Capabilities

Complex Instruction Following: Accurately executes multi-step instructions, including those with specialized terminology.
Function Calling: Interprets and executes function calls, providing appropriate input and output values.
Domain-Specific Knowledge: Engages in informative discussions across Biology, Chemistry, Physics, Mathematics, Medicine, and Computer Science.

Intended Uses

Specialized Chatbots: Ideal for creating knowledgeable conversational agents in scientific and technical domains.
Instructional Assistants: Supports users with educational and step-by-step guidance.
Code Generation and Execution: Facilitates code execution through function calls, aiding in software development.

Training Details

The model was fine-tuned on 75,000 examples of the Hercules-v4.0 dataset using 8 Kaggle TPUs. It employed a learning rate of 1e-4 with bfloat16 precision and a total batch size of 64. LoRA was used to freeze approximately 97% of the model parameters. The model is trained to use OpenAI's ChatML prompt format, adapted for function calling capabilities.

Limitations

Users should be aware of potential biases from underlying data sources, the risk of hallucinations or factual errors, and the possibility of misuse due due to its technical conversation and function call abilities.

Overview

Hercules-4.0-Yi-34B: Specialized Instruction Following and Function Calling

Key Capabilities

Intended Uses

Training Details

Limitations

Full Model Card (README)