Name: iaa01/CIA-1.7B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: iaa01

Model Overview

iaa01/CIA-1.7B is a language model post-trained with reinforcement learning (RL) using a reward mechanism called ∆Belief-RL. Whereas traditional RL methods often rely on sparse success signals, this model is rewarded for reducing its own uncertainty from one turn to the next. The result is a dense, turn-level feedback signal, which is particularly useful for complex, multi-turn tasks that require sustained information gathering.
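
To make the idea of a dense, uncertainty-based reward concrete, here is a minimal sketch. It is not the ∆Belief-RL implementation: this card does not give the exact reward formula, so the sketch assumes that "uncertainty" is the Shannon entropy of a belief distribution over candidate answers and that the per-turn reward is the drop in that entropy. The function names and the example trajectory are hypothetical.

```python
import math
from typing import List, Sequence


def entropy_bits(belief: Sequence[float]) -> float:
    """Shannon entropy (in bits) of a probability distribution."""
    return -sum(p * math.log2(p) for p in belief if p > 0)


def delta_belief_rewards(beliefs: Sequence[Sequence[float]]) -> List[float]:
    """Assumed reward: entropy before a turn minus entropy after it.

    `beliefs[0]` is the belief before the first turn and `beliefs[t]`
    is the belief after turn t, so the result has one reward per turn.
    """
    entropies = [entropy_bits(b) for b in beliefs]
    return [before - after for before, after in zip(entropies, entropies[1:])]


# Hypothetical belief over four candidate answers across two turns.
trajectory = [
    [0.25, 0.25, 0.25, 0.25],  # start: maximally uncertain (2 bits)
    [0.5, 0.5, 0.0, 0.0],      # after turn 1: two candidates ruled out
    [1.0, 0.0, 0.0, 0.0],      # after turn 2: answer identified
]
rewards = delta_belief_rewards(trajectory)
print(rewards)
```

Under these assumptions every turn receives its own reward, whereas a sparse success signal would give feedback only once the answer has been identified.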

Key Capabilities

Uncertainty Reduction: Optimized to minimize its own belief uncertainty, leading to more efficient information seeking.
Dense Feedback: Leverages the ∆Belief reward signal for continuous, turn-level credit assignment.
Long-Horizon Tasks: Designed to excel in open-ended, information-seeking scenarios that span multiple interactions.
Generalizable Strategies: Trained in a Twenty Questions environment, it develops information-seeking strategies that can generalize beyond its specific training context.
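
As an illustration of what efficient information seeking means in a Twenty Questions setting, the sketch below compares two yes/no questions by the uncertainty they are expected to remove. It is a toy model, not the training environment: it assumes a uniform belief over a small, hypothetical candidate list and truthful answers, in which case the expected entropy reduction equals the binary entropy of the fraction of candidates that would answer "yes". All names are illustrative.

```python
import math
from typing import Callable, Sequence


def binary_entropy_bits(p: float) -> float:
    """Entropy (in bits) of a yes/no outcome with P(yes) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def expected_information_gain(
    candidates: Sequence[str], question: Callable[[str], bool]
) -> float:
    """Expected entropy reduction from asking `question`.

    Assumes a uniform belief over `candidates` and a truthful answer.
    """
    if not candidates:
        return 0.0
    yes_fraction = sum(1 for c in candidates if question(c)) / len(candidates)
    return binary_entropy_bits(yes_fraction)


# Hypothetical candidate set: four animals and four non-animals.
candidates = ["cat", "dog", "eagle", "shark", "oak", "rose", "car", "hammer"]
animals = {"cat", "dog", "eagle", "shark"}

balanced_gain = expected_information_gain(candidates, lambda c: c in animals)
narrow_gain = expected_information_gain(candidates, lambda c: c == "cat")
print(balanced_gain, narrow_gain)
```

In this toy setting the question that splits the candidates evenly removes a full bit of uncertainty, while the narrow guess removes roughly half a bit in expectation. A reward based on uncertainty reduction would therefore tend to favor the balanced question.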

Good For

Applications requiring strategic information gathering.
Scenarios where reducing uncertainty is a primary objective.
Developing agents for interactive, query-based systems.
Research into novel reinforcement learning techniques for language models.