Name: laion/100k_baseline__Qwen3-8B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: laion

Overview

laion/100k_baseline__Qwen3-8B is an 8-billion-parameter language model based on Qwen3-8B, with a context length of 32768 tokens. It was fine-tuned on a specialized collection of datasets consisting primarily of agent interaction traces from simulated environments: swesmith-sandboxes-with_tests, r2egym experiments (including askllm-hardened and constrained variants), tas_optimal_combined_traces, and Kimi-K2T-swesmith.

Key Capabilities

Agentic Reasoning: Fine-tuned on diverse agent interaction traces, which suggests it is suited to complex, multi-step tasks in simulated environments.
Extended Context Handling: The 32768-token context window lets it process long dialogues or detailed problem descriptions while staying coherent.
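
To make the 32768-token limit concrete, here is a minimal sketch of budgeting generation length against the context window. It assumes, as is typical for decoder-only models, that prompt tokens and generated tokens share the same window; the function name `max_new_tokens` and its parameters are hypothetical and not part of any API described in this listing.

```python
CONTEXT_WINDOW = 32768  # context length stated in this listing


def max_new_tokens(prompt_tokens: int, requested: int,
                   context_window: int = CONTEXT_WINDOW) -> int:
    """Return how many tokens can be generated without exceeding the window.

    Assumption: prompt and completion tokens count against the same window.
    """
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    if prompt_tokens >= context_window:
        raise ValueError("prompt alone fills or exceeds the context window")
    # Cap the request at whatever room the prompt leaves.
    return min(requested, context_window - prompt_tokens)
```

For example, a 30000-token interaction log leaves room for at most 2768 generated tokens, so a request for 4096 would be capped at that value.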

Training Details

The model was trained for 7 epochs with a learning rate of 4e-05, distributed across 128 devices. Training used the AdamW optimizer with specific beta and epsilon parameters, and a cosine learning rate scheduler with a warmup ratio of 0.1.
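
The shape of such a schedule can be sketched as follows. This is a common formulation of linear warmup followed by cosine decay, not necessarily the exact scheduler used for this model. The peak learning rate and warmup ratio come from the text above; the total step count is not stated, so `total_steps` is a hypothetical input, as is the function name `cosine_lr`.

```python
import math

PEAK_LR = 4e-05      # learning rate stated above
WARMUP_RATIO = 0.1   # warmup ratio stated above


def cosine_lr(step: int, total_steps: int,
              peak_lr: float = PEAK_LR,
              warmup_ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at `step` for linear warmup followed by cosine decay.

    Assumption: `total_steps` is hypothetical; the source does not give it.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine decay from the peak down to 0.
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

With a hypothetical 1000 total steps, the rate ramps linearly over the first 100 steps, reaches 4e-05 at step 100, and decays to roughly zero by step 1000.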

Good For

Developing and evaluating AI agents in simulated environments.
Tasks requiring understanding of complex interaction logs and decision-making processes.
Applications benefiting from a large context window for detailed problem-solving.
