Name: artificialguybr/Qwen2.5-0.5B-OpenHermes2.5 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: artificialguybr

Model Overview

This model, artificialguybr/Qwen2.5-0.5B-OpenHermes2.5, is a version of the Qwen2.5-0.5B base model fine-tuned by artificialguybr on the OpenHermes 2.5 dataset. It builds on the Qwen2.5 architecture, which improves on earlier Qwen releases in instruction following, long-text generation, and structured output.

Key Capabilities & Features

Enhanced Instruction Following: Improved ability to understand and execute instructions.
Long Text Generation: Generates up to 8K tokens of output within a 32,768-token context length.
Structured Output: Better at generating structured data, particularly JSON.
Multilingual Support: Supports over 29 languages.
Robustness: Increased resilience to diverse system prompts.
Base Architecture: Uses a Transformer architecture with RoPE, SwiGLU, RMSNorm, and attention QKV bias.
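
The two length limits listed above interact: the prompt and the generated text share one context window, so a long prompt leaves less room for output. The following is a minimal sketch of that budget calculation. It assumes that "8K" means 8,192 tokens and that prompt and output tokens count against the same 32,768-token window; the function name max_new_tokens is illustrative and not part of any library.

```python
CONTEXT_LENGTH = 32_768    # context length stated in the feature list
MAX_OUTPUT_TOKENS = 8_192  # assumption: "8K" read as 8 * 1024 tokens


def max_new_tokens(prompt_tokens: int, requested: int = MAX_OUTPUT_TOKENS) -> int:
    """Return the largest generation length that fits both stated limits."""
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    remaining = CONTEXT_LENGTH - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt already fills the context window")
    # The output is capped by the request, the output limit, and the space left.
    return min(requested, MAX_OUTPUT_TOKENS, remaining)
```

For example, a 1,000-token prompt leaves room for the full 8,192 output tokens, while a 30,000-token prompt leaves only 2,768.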

Training Details

The model was fine-tuned with the Axolotl framework on the OpenHermes 2.5 dataset. This dataset comprises about 1 million instruction and chat samples, most of them synthetically generated, and has contributed to the development of state-of-the-art LLMs. Training used a learning rate of 1e-5, a batch size of 5, and 3 epochs, with gradient checkpointing and BF16 mixed precision enabled.
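
To give a sense of scale for these hyperparameters, the sketch below estimates the number of optimizer steps they imply. It is a rough illustration only: it assumes exactly 1,000,000 training samples, one device, no gradient accumulation, and no sample packing, none of which is stated above. The function name optimizer_steps and its grad_accum and num_devices parameters are hypothetical.

```python
import math

DATASET_SAMPLES = 1_000_000  # "1 million" samples, taken at face value
BATCH_SIZE = 5               # batch size stated above
EPOCHS = 3                   # epochs stated above


def optimizer_steps(samples: int, batch_size: int, epochs: int,
                    grad_accum: int = 1, num_devices: int = 1) -> int:
    """Estimate total optimizer steps for a simple, unpacked training run."""
    # Samples consumed per optimizer step across all devices.
    effective_batch = batch_size * grad_accum * num_devices
    steps_per_epoch = math.ceil(samples / effective_batch)
    return steps_per_epoch * epochs


total = optimizer_steps(DATASET_SAMPLES, BATCH_SIZE, EPOCHS)
```

Under these assumptions the run comes to 600,000 steps. Gradient accumulation, multiple devices, or sample packing would each lower that figure.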

Intended Uses

This model is intended for research and applied natural language processing tasks, including text generation and language understanding. It can serve as a foundation for conversational AI after further fine-tuning (e.g., SFT or RLHF); direct conversational use without that additional fine-tuning is not recommended. Users should also be aware of potential biases inherited from the training data.
