Paper ID: 2210.01487

SwarMan: Anthropomorphic Swarm of Drones Avatar with Body Tracking and Deep Learning-Based Gesture Recognition

Ahmed Baza, Ayush Gupta, Ekaterina Dorzhieva, Aleksey Fedoseev, Dzmitry Tsetserukou

Anthropomorphic robot avatars present a conceptually novel approach to remote affective communication, offering people across the world a wider spectrum of emotional and social exchanges than traditional 2D and 3D image data. However, current telepresence robots have several limitations, such as high weight, system complexity that prevents fast deployment, and the limited workspace of avatars mounted on either static or wheeled mobile platforms. In this paper, we present SwarMan, a novel concept of telecommunication through a robot avatar based on an anthropomorphic swarm of drones. The developed system consists of nine nanocopters controlled remotely by the operator through a gesture recognition interface. SwarMan allows operators to communicate in two ways: the swarm directly follows their motions, and the system recognizes one of the prerecorded emotional patterns and renders the captured emotion as illumination on the drones. The LSTM MediaPipe network was trained on a collected dataset of 600 short videos with five emotional gestures. The achieved emotion recognition accuracy was 97% on the test dataset. As communication through the swarm avatar significantly changes the visual appearance of the operator, we investigated the ability of users to recognize and respond to emotions performed by the swarm of drones. The experimental results revealed high consistency between users in rating emotions. Additionally, users indicated low physical demand (2.25 on the Likert scale) and were satisfied with their performance (1.38 on the Likert scale) when communicating through the SwarMan interface.

Submitted: Oct 4, 2022