Name: ParrotRouter/Qwen3-4B-Instruct-2507-20250808-233922-0 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: ParrotRouter

Model Overview

ParrotRouter/Qwen3-4B-Instruct-2507-20250808-233922-0 is a 4-billion-parameter model built on the Qwen3-4B architecture. ParrotRouter produced it with a layer-wise merging technique that combines layers from several fine-tuned Qwen3-4B variants, with the aim of improving performance on specific tasks.

Key Capabilities & Performance

This model demonstrates strong performance on academic benchmarks:

GPQA Diamond (0-shot): Scores 45.45% on graduate-level physics question answering, reflecting its focus on complex scientific reasoning.
MMLU (5-shot): Scores 72.51% across 57 subjects, indicating broad language understanding.
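To make the "0-shot" and "5-shot" labels concrete, here is a minimal sketch of how a multiple-choice prompt with k solved examples might be assembled and how an accuracy percentage is computed. The prompt layout and the names `build_prompt` and `accuracy` are illustrative assumptions; the evaluation harness and exact prompt format behind the reported scores are not stated in the model card.

```python
def build_prompt(question, choices, examples=()):
    """Format a multiple-choice prompt; len(examples) is the 'k' in k-shot.

    Hypothetical layout for illustration only; real harnesses differ.
    """
    def render(q, opts, answer=None):
        lines = [f"Question: {q}"]
        lines += [f"{letter}. {text}" for letter, text in zip("ABCD", opts)]
        # Solved examples show their answer; the final question leaves it open.
        lines.append(f"Answer: {answer}" if answer else "Answer:")
        return "\n".join(lines)

    blocks = [render(q, opts, ans) for q, opts, ans in examples]
    blocks.append(render(question, choices))
    return "\n\n".join(blocks)


def accuracy(predictions, references):
    """Percentage of predictions that exactly match the reference letters."""
    correct = sum(p == r for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)


# Toy data, not taken from any benchmark.
solved = [("What is 1 + 1?", ["1", "2", "3", "4"], "B")] * 5
zero_shot = build_prompt("What is 2 + 2?", ["3", "4", "5", "6"])
five_shot = build_prompt("What is 2 + 2?", ["3", "4", "5", "6"], solved)
toy_score = accuracy(["A", "B", "C", "D"], ["A", "B", "C", "A"])
```

In a 0-shot run the model sees only the target question, while a 5-shot run prepends five solved examples in the same format.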

Unique Approach

What sets this model apart is its layer-wise merging process. Instead of being fine-tuned in the traditional way, the model is assembled by selecting each of its 36 transformer layers (indices 0-35) from one of several source models and combining them. This integrates specialized knowledge and capabilities from multiple fine-tuned models into a single model. The non-layer weights (embeddings and final layers) come from the base Qwen3-4B model.
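The merging idea described above can be sketched as follows. This is a minimal illustration, not ParrotRouter's actual procedure: plain dictionaries stand in for model state dicts, the `model.layers.<index>.` key prefix is assumed to follow the usual Hugging Face naming for Qwen-style checkpoints, and `merge_layerwise`, `toy_state_dict`, and the layer-to-source mapping are hypothetical names and values chosen for the example.

```python
import re

# Assumed key pattern for per-layer weights, e.g. "model.layers.12.mlp.up_proj.weight".
LAYER_KEY = re.compile(r"^model\.layers\.(\d+)\.")


def merge_layerwise(base, sources, layer_to_source):
    """Take each transformer layer from its chosen source; keep the rest from base."""
    merged = {}
    for key, tensor in base.items():
        match = LAYER_KEY.match(key)
        if match is None:
            # Non-layer weights (embeddings, final layers) come from the base model.
            merged[key] = tensor
            continue
        layer_idx = int(match.group(1))
        source_name = layer_to_source[layer_idx]
        merged[key] = sources[source_name][key]
    return merged


def toy_state_dict(tag, num_layers):
    """Build a stand-in state dict whose values are strings instead of tensors."""
    state = {
        "model.embed_tokens.weight": f"{tag}:embed",
        "model.norm.weight": f"{tag}:norm",
    }
    for i in range(num_layers):
        state[f"model.layers.{i}.mlp.up_proj.weight"] = f"{tag}:layer{i}"
    return state


# Four layers instead of 36 to keep the toy example small.
base = toy_state_dict("base", 4)
sources = {"variant_a": toy_state_dict("a", 4), "variant_b": toy_state_dict("b", 4)}
layer_to_source = {0: "variant_a", 1: "variant_b", 2: "variant_b", 3: "variant_a"}
merged = merge_layerwise(base, sources, layer_to_source)
```

With real checkpoints the dictionary values would be weight tensors, and the mapping would cover all 36 layer indices. How the source for each layer is chosen is not described in the model card.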

Intended Use Cases

This model is particularly well-suited for:

Graduate-level physics Q&A: This is the model's primary optimization target, so it is best suited to complex scientific inquiries.
Research and experimentation: Ideal for exploring the effectiveness of layer-wise merging techniques and specialized model development.

Limitations

Because this model is an experimental merge, its performance may vary on tasks outside its specific optimization targets. Users should validate its suitability for their particular use case before deployment.
