# Hercules-5.0-Qwen2-1.5B Overview
M4-ai/Hercules-5.0-Qwen2-1.5B is a 1.5-billion-parameter language model developed by M4-ai, fine-tuned from the Qwen2-1.5B base model. It is designed as a general-purpose assistant and was fine-tuned on a high-quality mixed dataset. The model uses the ChatML prompt format.
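As a rough illustration of the ChatML format, the sketch below builds a prompt string by hand. The helper function and the system prompt text are illustrative assumptions, not part of the model card; in practice one would typically use a tokenizer's built-in chat template instead.

```python
# Minimal sketch of the ChatML prompt format used by Qwen2-based models.
# ChatML wraps each turn in <|im_start|>ROLE ... <|im_end|> markers and
# ends with an open assistant turn for the model to complete.
# build_chatml_prompt is a hypothetical helper, not from the model card.

def build_chatml_prompt(messages):
    """Render a list of {role, content} dicts into a ChatML string."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Open the assistant turn so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is 2 + 2?"},
])
print(prompt)
```

With the Hugging Face `transformers` library, `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` produces the equivalent string from the model's bundled template.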
## Key Capabilities
- General-purpose assistance: handles a broad range of everyday assistant tasks.
- Mathematical reasoning: works through math-related problems.
- Code generation: proficient at coding tasks.
- Writing assistance: generates and helps refine written content.
- Question answering: answers direct questions effectively.
- Chain-of-thought reasoning: supports step-by-step reasoning for complex problems.
## Training Details
The model was fine-tuned on the Locutusque/hercules-v5.0 dataset. Training was conducted in bf16 (non-mixed) precision on 8 Kaggle TPU cores, with a global batch size of 256 and a sequence length of 1536 tokens. The developers plan to release a DPO (Direct Preference Optimization) version in the future.
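The reported hyperparameters imply a per-step token throughput. A small sketch of that arithmetic follows; it assumes the global batch size counts full-length sequences and that the batch is split evenly across the 8 TPU cores, neither of which is stated explicitly in the model card.

```python
# Derive per-step and per-device figures from the reported training setup.
# Assumptions: the global batch of 256 counts 1536-token sequences, and it
# is sharded evenly across the 8 TPU cores.
global_batch_size = 256   # sequences per optimizer step
sequence_length = 1536    # tokens per sequence
num_devices = 8           # Kaggle TPU cores

tokens_per_step = global_batch_size * sequence_length
per_device_batch = global_batch_size // num_devices

print(tokens_per_step)   # 393216 tokens processed per optimizer step
print(per_device_batch)  # 32 sequences per TPU core
```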
## Licensing and Language
Hercules-5.0-Qwen2-1.5B is released under the Apache-2.0 license. Its primary language is English, with potential capability in Chinese inherited from the Qwen2 base model.