Selection as Power: Constrained Reinforcement for Bounded...

Selection as Power: Constrained Reinforcement for Bounded Decision Authority

arXiv:2603.02019v1 Announce Type: cross Abstract: Selection as Power argued that upstream selection authority, rather than internal objective misalignment, constitutes a primary source of risk in high-stakes agentic systems. However, the original framework was static: governance constraints bounded selection power but did not adapt over time. In this work, we extend the framework to dynamic settings by introducing incentivized selection governance, where reinforcement updates are applied to scoring and reducer parameters under externally enforced sovereignty constraints. We formalize selection as a constrained reinforcement process in which parameter updates are projected onto governance-defined feasible sets, preventing concentration beyond prescribed bounds. Across multiple regulated financial scenarios, unconstrained reinforcement consistently collapses into deterministic dominance under repeated feedback, especially at higher learning rates. In contrast, incentivized governance enables adaptive improvement while maintaining bounded selection concentration. Projection-based constraints transform reinforcement from irreversible lock-in into controlled adaptation, with governance debt quantifying the tension between optimization pressure and authority bounds. These results demonstrate that learning dynamics can coexist with structural diversity when sovereignty constraints are enforced at every update step, offering a principled approach to integrating reinforcement into high-stakes agentic systems without surrendering bounded selection authority.

相关推荐

Mitigating topology biases in Graph Diffusion via Counterfactual Intervention

MAP-Diff: Multi-Anchor Guided Diffusion for Progressive 3D Whole-Body Low-Dose PET Denoising

Mitigating topology biases in Graph Diffusion via Counterfactual Intervention

LaSTR: Language-Driven Time-Series Segment Retrieval

CodecFlow: Efficient Bandwidth Extension via Conditional Flow Matching in Neural Codec Latent Space

MatRIS: Toward Reliable and Efficient Pretrained Machine Learning Interaction Potentials