Paper ID: 2111.14971

Classification of animal sounds in a hyperdiverse rainforest using Convolutional Neural Networks

Yuren Sun, Tatiana Midori Maeda, Claudia Solis-Lemus, Daniel Pimentel-Alarcon, Zuzana Burivalova

To protect tropical forest biodiversity, we need to be able to detect it reliably, cheaply, and at scale. Automated species detection from passively recorded soundscapes via machine-learning approaches is a promising technique towards this goal, but it is constrained by the need for large training data sets. Using soundscapes from a tropical forest in Borneo and a Convolutional Neural Network (CNN) model created with transfer learning, we investigate i) the minimum viable training data set size for accurate prediction of call types ('sonotypes'), and ii) the extent to which data augmentation can overcome the issue of small training data sets. We found that even relatively large sample sizes (> 80 per call type) lead to mediocre accuracy. Accuracy improves significantly with data augmentation, however, including at extremely small sample sizes, regardless of taxonomic group or call characteristics. Our results suggest that transfer learning and data augmentation can make the use of CNNs to classify species' vocalizations feasible even for small soundscape-based projects with many rare species. Retraining our open-source model requires only basic programming skills, which makes it possible for individual conservation initiatives to adapt it to their local context and so enable more evidence-informed management of biodiversity.
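
As a rough illustration of how data augmentation can enlarge a small set of labelled calls, the sketch below creates extra training copies of an audio clip by shifting it in time and adding background noise. The abstract does not state which augmentation methods the authors used, so the two transformations, the helper names `augment_clip` and `augment_dataset`, and the parameters `max_shift_frac` and `snr_db` are all assumptions made for this example, not the paper's implementation.

```python
import math
import random


def augment_clip(clip, rng, max_shift_frac=0.2, snr_db=20.0):
    """Return one augmented copy of a non-empty clip (list of samples).

    Hypothetical helper: time shift plus additive noise are assumed
    augmentations, not necessarily the ones used in the paper.
    """
    n = len(clip)
    # Circularly shift the clip in time by a random number of samples.
    max_shift = int(max_shift_frac * n)
    shift = rng.randint(-max_shift, max_shift)
    shifted = [clip[(i - shift) % n] for i in range(n)]
    # Add Gaussian noise scaled to the requested signal-to-noise ratio.
    signal_power = sum(x * x for x in shifted) / n
    noise_std = math.sqrt(signal_power / (10.0 ** (snr_db / 10.0)))
    return [x + rng.gauss(0.0, noise_std) for x in shifted]


def augment_dataset(clips, copies_per_clip, seed=0):
    """Keep each original clip and append `copies_per_clip` augmented copies."""
    rng = random.Random(seed)
    out = []
    for clip in clips:
        out.append(list(clip))
        for _ in range(copies_per_clip):
            out.append(augment_clip(clip, rng))
    return out


# Three synthetic one-second "calls" stand in for real recordings.
clips = [[math.sin(0.02 * (k + 1) * i) for i in range(1000)] for k in range(3)]
augmented = augment_dataset(clips, copies_per_clip=4)
```

With four augmented copies per clip, a set of three recordings becomes fifteen training examples, each the same length as its original. In practice the augmented clips would be converted to spectrograms before being passed to a CNN.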

Submitted: Nov 29, 2021