Text Sanitization

Text sanitization removes or modifies personally identifiable information (PII) in text data while preserving the text's utility for downstream tasks. Current research moves beyond simple keyword replacement, using large language models and differential privacy techniques to achieve better privacy-utility trade-offs. These methods support applications such as whistleblowing, data sharing, and compliance with privacy regulations, where anonymized data must be released with minimal re-identification risk. The field is also exploring methods that account for contextual information and writing style, since either can identify a person even after explicit identifiers are removed.
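For orientation, the simple keyword-replacement baseline that this research tries to improve on can be sketched in a few lines. This is a minimal illustration only: the `PATTERNS` table, the placeholder labels, and the `sanitize` function are hypothetical names chosen here, not taken from any particular paper or library, and the two regular expressions cover only a narrow set of email and phone formats.

```python
import re

# Hypothetical patterns for illustration; real PII detection needs much
# broader coverage (names, addresses, IDs, locale-specific formats).
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def sanitize(text: str) -> str:
    """Replace each pattern match with a bracketed placeholder such as [EMAIL]."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


if __name__ == "__main__":
    print(sanitize("Contact jane.doe@example.com or 555-123-4567."))
```

A baseline like this leaves context and writing style untouched, which is the gap the contextual and style-aware methods mentioned above aim to close.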

Papers