GestaltLabs/Ornstein3.6-35B-A3B
Ornstein3.6-35B-A3B by GestaltLabs is a multimodal fine-tune of the Qwen 3.6 35B-A3B Mixture-of-Experts (MoE) model, featuring 34.66 billion total parameters with approximately 3 billion active per token and a 262,144-token context length. Developed by DJLougen, this model is part of the Ornstein series, specifically optimized for reasoning and agent-oriented tasks through a custom data curation pipeline. It supports image-text-to-text conditional generation, making it suitable for complex multimodal applications requiring advanced reasoning.
Ornstein3.6-35B-A3B: Multimodal Reasoning MoE
Ornstein3.6-35B-A3B is a multimodal fine-tune of the Qwen 3.6 35B-A3B Mixture-of-Experts (MoE) model, developed by DJLougen. It belongs to the "Ornstein" series of reasoning- and agent-oriented fine-tunes built on a custom data curation pipeline. The model uses the Qwen 3.6 MoE architecture with 34.66 billion total parameters, of which approximately 3 billion are active per token, and supports a context length of 262,144 tokens.
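To put the parameter counts above in perspective, the following minimal sketch computes the share of parameters that are active per token. The two constants come from the figures stated above (the active count is approximate); the function name `active_fraction` is hypothetical and used only for illustration.

```python
# Figures taken from the model description; the active count is approximate.
TOTAL_PARAMS = 34.66e9
ACTIVE_PARAMS = 3e9


def active_fraction(total: float, active: float) -> float:
    """Return the share of parameters used for a single token."""
    return active / total


frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
print(f"{frac:.1%}")  # prints 8.7%
```

Roughly one in twelve parameters takes part in each forward step, which is why per-token compute is closer to that of a 3B dense model even though all 34.66B parameters still have to be stored.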
Key Capabilities
- Multimodal Conditional Generation: The model includes the Qwen 3.6 base visual tower and image/video processor files, enabling it to handle image-text-to-text tasks.
- Mixture-of-Experts Architecture: Uses a Qwen 3.6 MoE that interleaves linear attention (Gated Delta Net) with full attention and routes each token to 8 of its 256 experts, so only a fraction of the parameters run per token.
- Reasoning and Agent-Oriented Tasks: Specifically fine-tuned for applications requiring advanced reasoning and agentic behaviors.
- High Context Length: Supports a substantial context window of 262,144 tokens, beneficial for processing long inputs and complex interactions.
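The "256 experts with 8 active per token" routing mentioned above can be illustrated with a small top-k gating sketch. This is an assumption-laden illustration, not the model's actual router: it uses random logits in place of a learned router, and it normalizes with a softmax over the selected experts, which is one common scheme that the source does not confirm for this model. The names `route_token`, `NUM_EXPERTS`, and `TOP_K` are hypothetical.

```python
import math
import random

NUM_EXPERTS = 256  # expert count stated in the model description
TOP_K = 8          # experts active per token, as stated in the description


def route_token(router_logits, top_k=TOP_K):
    """Pick the top_k experts for one token and return (indices, weights).

    Weights are a softmax over the selected logits only (an assumed scheme).
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:top_k]
    # Subtract the max logit for numerical stability before exponentiating.
    peak = max(router_logits[i] for i in chosen)
    exps = [math.exp(router_logits[i] - peak) for i in chosen]
    total = sum(exps)
    return chosen, [e / total for e in exps]


# Stand-in for the learned router output for a single token.
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

experts, weights = route_token(logits)
print(experts)
print(round(sum(weights), 6))
```

In a real MoE layer, the token's hidden state would be passed through only the selected experts and their outputs combined using these weights; the remaining 248 experts are skipped for that token.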
Good For
- Developers working on multimodal AI applications that require processing both text and image inputs.
- Use cases that demand strong reasoning and agentic behavior.
- Applications benefiting from a large context window for intricate problem-solving or extended dialogues.
- Users seeking an efficient MoE model for conditional generation tasks.
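For applications that rely on the 262,144-token context window, a simple budget check can help decide how much room is left for generation. The sketch below assumes that prompt tokens (including any tokens produced from image inputs) and generated tokens share the same window, which is the usual behavior for decoder-only models but is not stated in the source. The helper names are hypothetical, and token counts are assumed to come from whatever tokenizer or processor is in use.

```python
CONTEXT_LENGTH = 262_144  # context length stated in the model description


def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    context_length: int = CONTEXT_LENGTH) -> bool:
    """Return True if the prompt plus the requested generation fits the window."""
    return prompt_tokens + max_new_tokens <= context_length


def max_new_tokens_allowed(prompt_tokens: int,
                           context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens can still be generated after the prompt."""
    return max(0, context_length - prompt_tokens)


print(fits_in_context(200_000, 50_000))
print(max_new_tokens_allowed(200_000))
```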