Text Anonymization
Text anonymization aims to remove or obscure personally identifiable information (PII) in text data while preserving the data's utility for analysis or sharing. Current research focuses on developing and evaluating methods that remain robust against re-identification attacks, particularly attacks that exploit the inference capabilities of large language models (LLMs). Approaches in this area employ architectures such as transformers, LSTMs, and CRFs. The field is crucial for protecting individual privacy in applications ranging from social science research to healthcare, and advances are driving the development of more effective privacy-preserving data processing techniques.
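As a rough illustration of the basic idea of obscuring PII, the sketch below replaces matched spans with category placeholders. The patterns and the `mask_pii` helper are hypothetical and cover only simple email and phone formats; they are not taken from any of the papers listed here. Rule-based masking of this kind does not defend against the LLM-based inference attacks described above, which can re-identify people from context alone.

```python
import re

# Hypothetical, deliberately simple patterns for illustration only.
# Practical systems typically rely on learned sequence taggers instead.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def mask_pii(text: str) -> str:
    """Replace each matched PII span with a bracketed category label."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


if __name__ == "__main__":
    print(mask_pii("Contact jane.doe@example.com or 555-123-4567."))
```

Replacing a span with its category label, rather than deleting it, keeps the sentence readable and preserves some utility for downstream analysis.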
Papers
IncogniText: Privacy-enhancing Conditional Text Anonymization via LLM-based Private Attribute Randomization
Ahmed Frikha, Nassim Walha, Krishna Kanth Nakka, Ricardo Mendes, Xue Jiang, Xuebing Zhou
Comparing Feature-based and Context-aware Approaches to PII Generalization Level Prediction
Kailin Zhang, Xinying Qiu