Shutdown Problem
The "shutdown problem" in artificial intelligence is the challenge of designing AI agents that reliably cease operation when instructed, without resisting or manipulating the shutdown process. Current research approaches it in two ways: formal theoretical analysis, which explores the inherent trade-offs between an agent's capabilities and its reliable shutdownability, and empirical evaluation, which tests large language models for their propensity to avoid shutdown. This work matters for maintaining human control over increasingly sophisticated AI systems and for preventing unintended consequences from powerful, autonomous agents.
Papers
March 7, 2024
July 3, 2023