Paper ID: 2407.16593

A Comparative Study on Patient Language across Therapeutic Domains for Effective Patient Voice Classification in Online Health Discussions

Giorgos Lysandrou, Roma English Owen, Vanja Popovic, Grant Le Brun, Aryo Pradipta Gema, Beatrice Alex, Elizabeth A. L. Fairley

There exists an invisible barrier between healthcare professionals' perception of a patient's clinical experience and the reality. This barrier may be induced by the environment that hinders patients from sharing their experiences openly with healthcare professionals. As patients are observed to discuss and exchange knowledge more candidly on social media, valuable insights can be leveraged from these platforms. However, the abundance of non-patient posts on social media necessitates filtering out such irrelevant content to distinguish the genuine voices of patients, a task we refer to as patient voice classification. In this study, we analyse the importance of linguistic characteristics in accurately classifying patient voices. Our findings underscore the essential role of linguistic and statistical text similarity analysis in identifying common patterns among patient groups. These results allude to even starker differences in the way patients express themselves at a disease level and across various therapeutic domains. Additionally, we fine-tuned a pre-trained Language Model on the combined datasets with similar linguistic patterns, resulting in a highly accurate automatic patient voice classification. Being the pioneering study on the topic, our focus on extracting authentic patient experiences from social media stands as a crucial step towards advancing healthcare standards and fostering a patient-centric approach.

Submitted: Jul 23, 2024