Knowledge Internalization

Knowledge internalization studies how individuals and systems integrate external information into their internal representations, enabling autonomous behavior that is consistent with learned values or knowledge. Current research develops computational models, such as those employing contrastive learning and internal reward systems, to understand how this process unfolds in both humans and artificial agents, with particular attention to challenges such as "reward hacking" and the superficial integration of knowledge in dialogue generation. This work advances our understanding of social learning, value acquisition, and the development of human-aligned artificial intelligence systems, and also sheds light on the evolution of cooperation within groups.
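
As a rough illustration of the kind of contrastive objective mentioned above, the sketch below shows an InfoNCE-style loss that pulls a dialogue-context embedding toward the embedding of the external knowledge it should internalize and pushes it away from unrelated knowledge in the same batch. This is a generic, assumption-based sketch in PyTorch, not the method of any particular paper; the function name and tensor shapes are hypothetical.

```python
# Minimal sketch (assumption-based, not from any specific paper): an InfoNCE-style
# contrastive loss that aligns each dialogue-context embedding with its paired
# external-knowledge embedding, treating other knowledge snippets in the batch
# as negatives.
import torch
import torch.nn.functional as F

def contrastive_internalization_loss(context_emb, knowledge_emb, temperature=0.1):
    """context_emb, knowledge_emb: (batch, dim) tensors paired row-wise;
    row i of knowledge_emb is the positive example for row i of context_emb."""
    context_emb = F.normalize(context_emb, dim=-1)
    knowledge_emb = F.normalize(knowledge_emb, dim=-1)
    # Cosine similarity of every context against every knowledge snippet in the batch.
    logits = context_emb @ knowledge_emb.T / temperature   # (batch, batch)
    # The matching (diagonal) pair is the positive class for each context.
    targets = torch.arange(context_emb.size(0), device=context_emb.device)
    return F.cross_entropy(logits, targets)

if __name__ == "__main__":
    torch.manual_seed(0)
    ctx = torch.randn(8, 128)   # e.g., outputs of a dialogue-context encoder
    kno = torch.randn(8, 128)   # e.g., outputs of an external-knowledge encoder
    print(contrastive_internalization_loss(ctx, kno).item())
```

Minimizing this loss encourages the model's internal representations to encode the external knowledge directly, rather than merely copying it at generation time, which is one way the literature frames the difference between internalized and superficial knowledge integration.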

Papers