Locutusque/Hercules-2.5-Mistral-7B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 8k · License: apache-2.0 · Architecture: Transformer · Open Weights

Hercules-2.5-Mistral-7B by Locutusque is a 7 billion parameter language model fine-tuned from mistralai/Mistral-7B-v0.1, designed for advanced instruction following, function calling, and conversational interactions across scientific and technical domains. It excels at executing complex instructions and drawing on domain-specific knowledge in subjects like Biology, Chemistry, Physics, and Computer Science. The model supports an 8192-token context length and is optimized for specialized chatbots and instructional assistants.


Hercules-2.5-Mistral-7B: Specialized Instruction Following and Function Calling

Hercules-2.5-Mistral-7B, developed by Locutusque, is a 7 billion parameter language model fine-tuned from mistralai/Mistral-7B-v0.1. It is engineered for enhanced instruction following, seamless function calling, and detailed conversation across scientific and technical fields. The model's training dataset, Hercules-v2.5, expands upon OpenHermes-2.5 with contributions from numerous curated datasets, focusing on complex instructions and domain-specific knowledge.

Key Capabilities

  • Complex Instruction Following: Accurately executes multi-step instructions, including those with specialized terminology.
  • Function Calling: Interprets and executes function calls, providing appropriate input and output values.
  • Domain-Specific Knowledge: Engages in informative discussions across Biology, Chemistry, Physics, Mathematics, Medicine, and Computer Science.
  • Performance: Achieved an average score of 63.59 on the Open LLM Leaderboard, outperforming many merge-free SFT Mistral fine-tunes.
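The function-calling capability above can be sketched as a simple round trip: the model emits a structured call instead of prose, and the application parses and executes it. The tool name, schema, and JSON call format below are illustrative assumptions, not the model's documented interface:

```python
import json

# Hypothetical tool; the name, signature, and lookup table are illustrative.
def get_molar_mass(formula: str) -> float:
    masses = {"H2O": 18.015, "CO2": 44.009}
    return masses[formula]

TOOLS = {"get_molar_mass": get_molar_mass}

def dispatch(model_output: str):
    """Parse a JSON function call emitted by the model and execute it."""
    call = json.loads(model_output)
    fn = TOOLS[call["name"]]          # look up the requested tool
    return fn(**call["arguments"])    # pass the model-provided arguments

# The model's reply is a structured call rather than free text:
result = dispatch('{"name": "get_molar_mass", "arguments": {"formula": "H2O"}}')
print(result)  # 18.015
```

In practice the returned value would be fed back to the model in a follow-up turn so it can compose the final answer.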

Intended Uses

  • Specialized Chatbots: Ideal for creating knowledgeable conversational agents in scientific and technical domains.
  • Instructional Assistants: Supports users with educational and step-by-step guidance.
  • Code Generation and Execution: Facilitates code execution through function calls, aiding in software development and prototyping.
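As a concrete starting point for these uses, the model can be loaded with Hugging Face transformers and prompted in ChatML (the format named in the training details below). The prompt helper follows the standard ChatML layout; the generation settings are a plausible sketch, not values recommended by the model card:

```python
import os

def to_chatml(messages):
    """Render a list of {role, content} dicts in OpenAI's ChatML format."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    # Leave the assistant turn open so the model completes it.
    return "\n".join(parts) + "\n<|im_start|>assistant\n"

prompt = to_chatml([
    {"role": "system", "content": "You are Hercules, a scientific assistant."},
    {"role": "user", "content": "Briefly explain Le Chatelier's principle."},
])

# Guarded because this downloads ~14 GB of weights on first run.
if os.environ.get("RUN_GENERATION"):
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("Locutusque/Hercules-2.5-Mistral-7B")
    model = AutoModelForCausalLM.from_pretrained("Locutusque/Hercules-2.5-Mistral-7B")
    inputs = tok(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
    print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```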

Training Details

The model was trained on 8 Kaggle TPUs using torch_xla SPMD, with the Adam optimizer and a learning rate of 2e-06. It was fine-tuned on 200,000 examples of Hercules-v2.0 and 100,000 examples of Hercules-v2.5, using OpenAI's ChatML prompt format adapted for function calling.

Popular Sampler Settings

The sampler parameters most commonly tuned by Featherless users for this model are: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.