vectionlabs/Salience-1-9B

VISION
Concurrency Cost: 1
Model Size: 8B
Quant: FP8
Ctx Length: 32k
Tool Calling: Supported
Published: Jun 10, 2026
License: apache-2.0
Architecture: Transformer
Open Weights, Warm

Salience 1 (9B) by Vection Labs is a dense, 9-billion-parameter vision-language model built on the Qwen3-VL architecture, pairing a Qwen3-8B language model with a native vision encoder. It is engineered for hard, practical work: code generation, agentic tasks, multi-step mathematical reasoning, and visual understanding over images and video. The model supports a context window of up to 1 million tokens, which suits large codebases and long documents. It is optimized for fast inference on modest hardware, including 2x T4 GPUs without GGUF, and is released under the Apache-2.0 license.

Salience 1 (9B): A Multimodal Reasoning and Coding Model

Salience 1 (9B) is a 9-billion-parameter vision-language model developed by Vection Labs, designed for demanding technical tasks. It builds upon the Qwen3-VL architecture, integrating a 36-layer Qwen3-8B language model with a native vision encoder. A key differentiator is its focus on code and agentic work, making it highly effective for writing and debugging code, driving tools, and multi-step reasoning.

Key Capabilities

  • Code & Agentic First: Tuned to produce runnable code and well-formed tool calls, leveraging a coding/DevOps donor model.
  • Deep Reasoning: Provides structured, inspectable chains of thought for complex math, logic, and code problems.
  • Genuinely Multimodal: Processes both images and video as first-class inputs, not just for captioning, with a context window up to 1 million tokens via interleaved multimodal RoPE.
  • Efficient Performance: Optimized for fast inference on modest hardware. It runs on 2x T4 GPUs (fp16, sharded) or on a single T4 with 4-bit quantization, and supports speculative decoding for a 1.5x to 2.5x speedup.
  • Adaptive Thinking: Supports /no_think for direct answers and /think for deep, step-by-step reasoning, allowing users to control latency based on task complexity.
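
The hardware figures in the list above can be sanity-checked with rough arithmetic. The sketch below is only a back-of-the-envelope estimate of weight memory, not a measurement: it assumes exactly 9 billion parameters and a nominal 16 GB per T4, and it ignores the KV cache, activations, quantization scales, and framework overhead. The helper name weight_memory_gib is illustrative and not part of any Salience tooling.

```python
# Rough weight-memory estimate; all inputs are assumptions, not measured values.
GIB = 1024 ** 3

def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Memory needed for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / GIB

PARAMS = 9e9   # assumed: 9 billion parameters, as stated on the card
T4_GIB = 16    # assumed: nominal T4 memory; usable memory is somewhat lower

fp16_total = weight_memory_gib(PARAMS, 16)   # about 16.8 GiB
fp16_per_gpu = fp16_total / 2                # about 8.4 GiB when sharded over 2 GPUs
int4_total = weight_memory_gib(PARAMS, 4)    # about 4.2 GiB

print(f"fp16 weights: {fp16_total:.1f} GiB total, {fp16_per_gpu:.1f} GiB per GPU on 2x T4")
print(f"4-bit weights: {int4_total:.1f} GiB on a single T4")
```

Under these assumptions the fp16 weights alone exceed one T4 but fit comfortably once split across two, and the 4-bit weights leave most of a single T4 free for the KV cache. Actual headroom depends on context length and the serving stack.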

Intended Use Cases

  • Technical Assistance: Code generation, explanation, debugging, and repo-scale tasks.
  • Agentic Workflows: Generating structured tool calls for automated processes.
  • Quantitative Reasoning: Step-by-step mathematical and logical problem-solving.
  • Visual Understanding: Analyzing diagrams, charts, UI screenshots, and short video clips.
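
For the agentic use case above, a caller usually validates a tool call before executing it. The following is a minimal sketch under loud assumptions: the card does not specify the tool-call wire format, so the JSON shape with "name" and "arguments" keys, the parse_tool_call helper, and the read_file tool are all hypothetical and chosen only for illustration.

```python
import json

def parse_tool_call(raw: str, known_tools: dict[str, set[str]]) -> dict:
    """Parse a JSON tool call and check it against a registry of required arguments.

    The {"name": ..., "arguments": {...}} shape is an assumed format,
    not one documented for this model.
    """
    call = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")
    name = call.get("name")
    args = call.get("arguments")
    if name not in known_tools:
        raise ValueError(f"unknown tool: {name!r}")
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")
    missing = known_tools[name] - set(args)
    if missing:
        raise ValueError(f"missing arguments: {sorted(missing)}")
    return {"name": name, "arguments": args}

# Hypothetical registry: tool name -> required argument names.
TOOLS = {"read_file": {"path"}}

call = parse_tool_call('{"name": "read_file", "arguments": {"path": "README.md"}}', TOOLS)
print(call["name"], call["arguments"])
```

Rejecting malformed or unknown calls before dispatch keeps an automated loop from acting on output that only looks like a tool call.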

Salience 1 is released under the Apache-2.0 license and is designed for developers who prioritize functional output over conversational pleasantries.