Paper ID: 2211.15532

YZR-net: Self-supervised Hidden representations Invariant to Transformations for profanity detection

Vedant Sandeep Joshi, Sivanagaraja Tatinati, Yubo Wang

On current e-learning platforms, live classes are an important tool that gives students an opportunity to get more involved while learning new concepts. In such classes, interaction with teachers and peers helps remove learning silos and lets each student experience some aspects of offline learning in this era of virtual classes. One common mode of interaction is the chat / messaging framework, through which the teacher can broadcast messages and get instant feedback from the students in the live class. This freedom of interaction is crucial to a student's learning growth, but its misuse can have serious repercussions. Some miscreants use this framework to send profane messages, which can have a negative impact on other students as well as on the teacher of the class. These rare but high-impact situations create a need for automatic detection mechanisms that prevent such chats from being posted on any platform. In this work we develop YZR-Net, a self-supervised framework that robustly detects profane words used in a chat even if the student adds clever modifications to fool the system. Matching at the token / word level allows us to maintain a compact and dynamic profane vocabulary, which can be updated without retraining the underlying model. Our profanity detection framework is language independent and can handle abuses in both English and its transliterated counterpart Hinglish (Hindi-language words written in English).
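The token-level matching idea described in the abstract can be illustrated with a minimal sketch. This is not the YZR-Net model: the learned self-supervised encoder is replaced here by a hand-written stand-in (character normalization plus bigram counts), and every name below (`SUBSTITUTIONS`, `normalize`, `embed`, `ProfaneVocabulary`, the 0.8 threshold, the placeholder words) is an assumption made for illustration. The sketch only shows why matching tokens against stored vocabulary representations lets the vocabulary grow without retraining the encoder.

```python
import math
from collections import Counter

# Assumed substitution table for common obfuscations; not from the paper.
SUBSTITUTIONS = str.maketrans(
    {"@": "a", "4": "a", "3": "e", "1": "i", "0": "o", "$": "s", "5": "s", "7": "t"}
)


def normalize(token):
    """Lowercase, undo simple character swaps, drop non-letters, collapse repeats."""
    token = token.lower().translate(SUBSTITUTIONS)
    letters = [ch for ch in token if ch.isalpha()]
    collapsed = []
    for ch in letters:
        if not collapsed or collapsed[-1] != ch:
            collapsed.append(ch)
    return "".join(collapsed)


def embed(token, n=2):
    """Stand-in encoder: character n-gram counts of the normalized token."""
    padded = "#" + normalize(token) + "#"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ProfaneVocabulary:
    """Compact vocabulary of representations; adding a word needs no retraining."""

    def __init__(self, words, threshold=0.8):
        self.threshold = threshold  # assumed value, would need tuning
        self.entries = {w: embed(w) for w in words}

    def add(self, word):
        self.entries[word] = embed(word)

    def match(self, token):
        """Return the closest vocabulary word if it clears the threshold, else None."""
        vec = embed(token)
        best_word, best_score = None, 0.0
        for word, ref in self.entries.items():
            score = cosine(vec, ref)
            if score > best_score:
                best_word, best_score = word, score
        return best_word if best_score >= self.threshold else None

    def flag_message(self, message):
        """Return (token, matched_word) pairs for every flagged token."""
        hits = []
        for token in message.split():
            matched = self.match(token)
            if matched is not None:
                hits.append((token, matched))
        return hits


# Placeholder entries stand in for real profane words.
vocab = ProfaneVocabulary(["badword"])
print(vocab.flag_message("you are a b@dw0rd!! hello"))
vocab.add("kharab")  # vocabulary update, encoder untouched
print(vocab.flag_message("kh@r@b class"))
```

In this sketch the robustness to obfuscation comes from hand-written normalization rules, whereas the abstract attributes it to representations learned to be invariant to such transformations; the vocabulary lookup step is the part the two share.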

Submitted: Nov 22, 2022