Paper ID: 2112.07356

Technical Language Supervision for Intelligent Fault Diagnosis in Process Industry

Karl Löwenmark, Cees Taal, Stephan Schnabel, Marcus Liwicki, Fredrik Sandin

In the process industry, condition monitoring systems with automated fault diagnosis methods assist human experts and thereby improve maintenance efficiency, process sustainability, and workplace safety. Improving the automated fault diagnosis methods using data and machine learning-based models is a central aspect of intelligent fault diagnosis (IFD). A major challenge in IFD is to develop realistic datasets with accurate labels needed to train and validate models, and to transfer models trained with labeled lab data to heterogeneous process industry environments. However, fault descriptions and work-orders written by domain experts are increasingly digitised in modern condition monitoring systems, for example in the context of rotating equipment monitoring. Thus, domain-specific knowledge about fault characteristics and severities exists as technical language annotations in industrial datasets. Furthermore, recent advances in natural language processing enable weakly supervised model optimisation using natural language annotations, most notably in the form of natural language supervision (NLS). This creates a timely opportunity to develop technical language supervision (TLS) solutions for IFD systems grounded in industrial data, for example as a complement to pre-training with lab data to address problems like overfitting and inaccurate out-of-sample generalisation. We surveyed the literature and identify a considerable improvement in the maturity of NLS over the last two years, facilitating applications beyond natural language; a rapid development of weak supervision methods; and transfer learning as a current trend in IFD which can benefit from these developments. Finally we describe a general framework for TLS and implement a TLS case study based on SentenceBERT and contrastive learning based zero-shot inference on annotated industry data.

Submitted: Dec 11, 2021