Locutusque/Hercules-3.1-Mistral-7B

Text generation · Model size: 7B · Quantization: FP8 · Context length: 8k · Published: Feb 19, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights

Locutusque/Hercules-3.1-Mistral-7B is a 7 billion parameter language model fine-tuned from mistralai/Mistral-7B-v0.1. It is designed for enhanced instruction following, function calling, and conversational interactions across scientific and technical domains. The model excels at understanding complex instructions and executing function calls, making it well suited for specialized chatbots and instructional assistants. It supports a context length of 8192 tokens.


Hercules-3.1-Mistral-7B: Specialized Instruction Following and Function Calling

Hercules-3.1-Mistral-7B is a 7 billion parameter language model, fine-tuned from mistralai/Mistral-7B-v0.1, with a focus on advanced instruction following, function calling, and domain-specific conversation. It was trained on the Hercules-v3.0 dataset, an expansion of OpenHermes-2.5 that incorporates diverse curated datasets to broaden its capabilities.

Key Capabilities

  • Complex Instruction Following: Accurately executes multi-step instructions, including those with specialized terminology.
  • Function Calling: Seamlessly interprets and executes function calls, providing appropriate input and output values.
  • Domain-Specific Knowledge: Engages in informative conversations across Biology, Chemistry, Physics, Mathematics, Medicine, and Computer Science.
  • Code Generation and Execution: Facilitates code execution through function calls, aiding in software development and prototyping.
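The function-calling flow described above can be sketched in a few lines. The JSON envelope (`{"name": ..., "arguments": ...}`) and the `get_molar_mass` tool are illustrative assumptions, not the model's documented call format; adapt the parsing to whatever schema your prompts define.

```python
import json

def dispatch_function_call(raw_output, registry):
    """Parse a model-emitted function call and invoke the matching tool.

    Assumes the model outputs JSON of the form
    {"name": <function name>, "arguments": {<kwargs>}}.
    """
    call = json.loads(raw_output)
    fn = registry[call["name"]]
    return fn(**call["arguments"])

def get_molar_mass(formula):
    # Toy lookup standing in for a real chemistry tool.
    return {"H2O": 18.015, "CO2": 44.01}.get(formula)

registry = {"get_molar_mass": get_molar_mass}
raw = '{"name": "get_molar_mass", "arguments": {"formula": "H2O"}}'
result = dispatch_function_call(raw, registry)  # 18.015
```

In practice the raw string would come from the model's generation; validating `call["name"]` against an allow-list before dispatch is advisable, since the model may hallucinate tool names.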

Training Details

The model was fine-tuned on 700,000 examples from the Hercules-v3.0 dataset using 8 Kaggle TPUs, with a learning rate of 2e-06, the Adam optimizer, and a linear scheduler. Training used bfloat16 precision and followed the OpenAI ChatML prompt format, adapted to support function calling.
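For reference, ChatML wraps each turn in `<|im_start|>`/`<|im_end|>` markers. The sketch below builds such a prompt by hand; the system message text is a placeholder, and any function-calling extensions the fine-tune adds on top of plain ChatML are not shown.

```python
def build_chatml_prompt(messages):
    """Render a list of {"role", "content"} dicts into a ChatML prompt,
    leaving the assistant turn open for the model to complete."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are Hercules, a helpful assistant."},
    {"role": "user", "content": "What is the boiling point of water?"},
])
```

With the Hugging Face `transformers` library, `tokenizer.apply_chat_template` typically produces this formatting automatically when the model repository ships a chat template.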

Performance

Evaluations on the Open LLM Leaderboard show an average score of 62.09, with notable results of 83.55 on HellaSwag (10-shot) and 63.65 on MMLU (5-shot).

Intended Uses

  • Specialized Chatbots: Ideal for creating knowledgeable conversational agents in scientific and technical fields.
  • Instructional Assistants: Supports users with educational and step-by-step guidance across various disciplines.
  • Software Development: Aids in code generation and execution through its function calling abilities.

Limitations

Users should be aware of potential biases from underlying data sources, the risk of hallucinations or factual errors in specialized domains, and the possibility of misuse due to its technical conversation and function execution capabilities.

Popular Sampler Settings

Sampling parameters commonly tuned for this model include temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.