Paper ID: 2301.01948

Random forests, sound symbolism and Pokemon evolution

Alexander James Kilpatrick, Aleksandra Cwiek, Shigeto Kawahara

This study constructs machine learning algorithms that are trained to classify samples using sound symbolism, and then reports on an experiment designed to compare their performance with that of human participants. Random forests are trained using the names of Pokemon, which are fictional video game characters, and their evolutionary status. Pokemon undergo evolution when certain in-game conditions are met. Evolution changes the appearance, abilities, and names of Pokemon. In Experiment 1, we train three random forests using the sounds that make up the names of Japanese, Chinese, and Korean Pokemon to classify Pokemon into pre-evolution and post-evolution categories. We then train a fourth random forest using the results of an elicitation experiment in which Japanese participants named previously unseen Pokemon. In Experiment 2, we reproduce those random forests with name length as an additional feature and compare their performance against humans in a classification experiment in which Japanese participants classified the names elicited in Experiment 1 into pre- and post-evolution categories. Experiment 2 reveals an overfitting issue in Experiment 1, which we resolve using a novel cross-validation method. The results show that the random forests are efficient learners of systematic sound-meaning correspondence patterns and can classify samples with greater accuracy than the human participants.
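
As a rough illustration of what "using the sounds that make up the names" with "name length as a feature" could look like in practice, the sketch below turns a name into a vector of sound counts, optionally followed by its length. It is not the authors' pipeline: the function `sound_features`, the made-up names, and the use of single romanized letters as stand-ins for sounds are all assumptions made for the example.

```python
from collections import Counter


def sound_features(name, inventory, include_length=False):
    """Return one count per symbol in `inventory`, optionally followed by name length.

    Hypothetical helper: each letter is treated as one "sound", which is a
    simplification and not the segmentation used in the study.
    """
    counts = Counter(name.lower())
    features = [counts[symbol] for symbol in inventory]
    if include_length:
        # Name length is added as one extra feature, as described for Experiment 2.
        features.append(len(name))
    return features


# Made-up names for illustration only (not real Pokemon names or study data).
names = ["mipo", "gadoran"]

# The feature columns are the sorted set of symbols seen across all names.
inventory = sorted(set("".join(names)))

# One feature row per name: sound counts plus length.
rows = [sound_features(n, inventory, include_length=True) for n in names]
```

Rows of this shape, paired with pre-evolution or post-evolution labels, are the kind of input a random forest implementation (for example, scikit-learn's `RandomForestClassifier`) could be fit on.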

Submitted: Jan 5, 2023