Paper ID: 2403.15615

NaturalTurn: A Method to Segment Transcripts into Naturalistic Conversational Turns

Gus Cooney, Andrew Reece

Conversation is the subject of increasing interest in the social, cognitive, and computational sciences. And yet, as conversational datasets continue to increase in size and complexity, researchers lack scalable methods to segment speech-to-text transcripts into conversational turns--the basic building blocks of social interaction. We introduce "NaturalTurn," a turn segmentation algorithm designed to accurately capture the dynamics of naturalistic exchange. NaturalTurn operates by distinguishing speakers' primary conversational turns from listeners' secondary utterances, such as backchannels, brief interjections, and other forms of parallel speech that characterize conversation. Using data from a large conversation corpus, we show how NaturalTurn-derived transcripts demonstrate favorable statistical and inferential characteristics compared to transcripts derived from existing methods. The NaturalTurn algorithm represents an improvement in machine-generated transcript processing methods, or "turn models" that will enable researchers to associate turn-taking dynamics with the broader outcomes that result from social interaction, a central goal of conversation science.

Submitted: Mar 22, 2024