Paper ID: 2302.10983

Do Orcas Have Semantic Language? Machine Learning to Predict Orca Behaviors Using Partially Labeled Vocalization Data

Sophia Sandholm

Orcinus orca (killer whales) produce complex calls that last about a second. Within a call, an orca typically uses multiple frequencies simultaneously and varies both the frequencies and their volumes. Behavior data is hard to obtain because orcas live underwater and travel quickly, whereas sound data is relatively easy to capture. As a science goal, we would like to know whether orca vocalizations constitute a semantic language. We approach this question by studying whether machine learning can predict behavior from vocalizations. Such prediction would also help scientific research and safety applications, because it would allow behavior to be inferred from sound alone. A significant challenge is the lack of labeled data. We work with recent recordings of McMurdo Sound orcas [Wellard et al. 2020], where each recording is labeled with the behaviors observed during the recording. This yields a dataset in which sound segments within the recordings (continuous vocalizations that can be thought of as call sequences or more general structures) are labeled with superfluous behaviors: each segment carries the behaviors noted for its whole recording, whether or not they occurred during that segment. Despite this, with a careful combination of recent machine learning techniques, we achieve 96.4% classification accuracy. This suggests that orcas do use a semantic language. It is also promising for research and applications.

Submitted: Jan 28, 2023