Paper ID: 2407.10254

The Elephant in the Room -- Why AI Safety Demands Diverse Teams

David Rostcheck, Lara Scheibling

We argue that existing approaches to AI "safety" and "alignment" may not be using the most effective tools, teams, or methods. We suggest that a better approach may be to treat alignment as a social science problem, and we enumerate reasons why: the social sciences enjoy a rich toolkit of models for understanding and aligning motivation and behavior, much of which could be repurposed for problems involving AI models. We introduce an alternative alignment approach informed by social science tools and characterized by three steps: (1) defining a positive desired social outcome for human/AI collaboration as the goal or "North Star," (2) properly framing knowns and unknowns, and (3) forming diverse teams to investigate, observe, and navigate emerging challenges in alignment.

Submitted: May 7, 2024

Topics

Artificial Intelligence
Human-AI Collaboration
AI Safety
Social Science
ROOM Layout
Alignment Approach
Heterogeneous Team
Pink Elephant
