Name: AMAImedia/Qwen3-8B-Nemotron-Orchestrator-NOESIS-BF16 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: AMAImedia

Overview

This model, AMAImedia/Qwen3-8B-Nemotron-Orchestrator-NOESIS-BF16, is an 8-billion-parameter BF16 reference checkpoint of nvidia/Nemotron-Orchestrator-8B. It is built on the Qwen3-8B architecture, a dense, decoder-only transformer. Developed by AMAImedia as part of the NOESIS Professional Multilingual Dubbing Automation Platform, the release offers a bandwidth-friendly alternative to the original FP32 model.

Key Characteristics

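Precision: BF16 (bfloat16), cast from the original FP32. BF16 keeps FP32's 8-bit exponent, so the dynamic range is unchanged; it stores 7 mantissa bits instead of 23, so the cast is lossless only for weights whose values are already exactly representable in BF16.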
Efficiency: Halves download size and disk footprint relative to the FP32 version (~16 GB vs ~32 GB) and spares users a slow load-time cast to BF16.
Architecture: Qwen3-8B, a dense (not MoE) decoder-only transformer.
Context: Inherits the capabilities of the Nemotron-Orchestrator-8B base model, which is designed for tool orchestration.
License: Inherits the NVIDIA Open Model License, designated "for research and development only."
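
The precision note above can be illustrated with a minimal sketch of an FP32-to-BF16 cast using only the Python standard library. The helper names (fp32_to_bf16_bits, bf16_bits_to_float, roundtrip) are hypothetical, and round-to-nearest-even is assumed as the rounding mode; this is not the conversion script used for the release. It handles finite inputs only.

```python
import struct


def fp32_to_bf16_bits(x: float) -> int:
    """Round a finite FP32 value to BF16 and return the 16-bit pattern.

    Assumes round-to-nearest-even; NaN inputs are not handled.
    """
    # Reinterpret the value as its 32-bit IEEE 754 single-precision pattern.
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    # Rounding bias: 0x7FFF plus the lowest kept bit, so exact ties go to even.
    bias = 0x7FFF + ((bits >> 16) & 1)
    # Keep sign (1 bit), exponent (8 bits), and the top 7 mantissa bits.
    return ((bits + bias) >> 16) & 0xFFFF


def bf16_bits_to_float(b: int) -> float:
    """Widen a BF16 bit pattern back to FP32 by zero-filling the low 16 bits."""
    (x,) = struct.unpack("<f", struct.pack("<I", b << 16))
    return x


def roundtrip(x: float) -> float:
    """Cast to BF16 and back, showing what survives the cast."""
    return bf16_bits_to_float(fp32_to_bf16_bits(x))


if __name__ == "__main__":
    for value in (1.0, 0.1, 1e38):
        print(value, "->", roundtrip(value))
```

Values such as 1.0 pass through unchanged, and a very large value such as 1e38 stays finite because the exponent width is the same as FP32. A value such as 0.1 is rounded to the nearest BF16 neighbour, and casting the result a second time changes nothing.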

Use Cases

Research and Development: Provides a clean BF16 baseline for experimenting with downstream quantization recipes and inference.
NOESIS Platform: Serves as the English orchestration teacher for NOESIS Specialist M9-ORCH-4B during knowledge distillation within the NOESIS multilingual dubbing automation platform.
Efficient Deployment: Suits users who want a smaller download and faster loading without sacrificing inference quality. The ~16 GB of BF16 weights can stay fully resident on GPUs with 24 GB or more of VRAM.
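
The size and VRAM figures quoted in this listing follow from simple arithmetic, sketched below. The nominal count of 8e9 parameters, the decimal-GB convention, and the overhead_gb allowance for KV cache and activations are all assumptions made for illustration; the function names are hypothetical, and real memory use depends on context length and the inference runtime.

```python
# Bytes used to store one parameter in each precision.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2}


def weight_size_gb(n_params: float, dtype: str) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


def fits_fully_resident(
    n_params: float, dtype: str, vram_gb: float, overhead_gb: float = 4.0
) -> bool:
    """Rough check that weights plus an assumed runtime overhead fit in VRAM.

    overhead_gb is a placeholder, not a measured value.
    """
    return weight_size_gb(n_params, dtype) + overhead_gb <= vram_gb


if __name__ == "__main__":
    n = 8e9  # nominal parameter count (assumption)
    print("bf16 weights:", weight_size_gb(n, "bf16"), "GB")
    print("fp32 weights:", weight_size_gb(n, "fp32"), "GB")
    print("bf16 on 24 GB:", fits_fully_resident(n, "bf16", 24))
    print("fp32 on 24 GB:", fits_fully_resident(n, "fp32", 24))
```

Under these assumptions the BF16 weights leave headroom on a 24 GB card, while the FP32 weights alone exceed it.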

Full Model Card (README)