Black Box Watermarking

Black-box watermarking embeds ownership information into machine learning models, particularly deep neural networks and large language models, so that the owner can later verify it from the model's outputs alone, without access to its internal parameters. Current research focuses on making watermarks robust to removal and evasion attacks, including model modification (such as fine-tuning or pruning), data poisoning, and paraphrasing of generated text, and often relies on backdooring or post-hoc insertion of semantic signatures. These techniques help protect intellectual property in AI by enabling verifiable ownership claims and deterring unauthorized model replication and distribution.
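
The backdoor-style, query-only verification mentioned above can be illustrated with a minimal sketch. It is not taken from any particular paper. It assumes a classifier exposed only through a hypothetical `predict` callable, a secret trigger set chosen by the owner, and a simplified null hypothesis in which an unrelated model matches each secret label independently with probability 1/num_classes. The names `count_trigger_hits`, `binomial_tail`, and `verify_ownership`, and the threshold `alpha`, are illustrative assumptions.

```python
import math
from typing import Callable, Hashable, Sequence, Tuple


def count_trigger_hits(
    predict: Callable[[Hashable], int],
    triggers: Sequence[Hashable],
    secret_labels: Sequence[int],
) -> int:
    """Query the suspect model on each trigger and count secret-label matches."""
    if not triggers or len(triggers) != len(secret_labels):
        raise ValueError("triggers and secret_labels must be non-empty and equal length")
    return sum(1 for x, y in zip(triggers, secret_labels) if predict(x) == y)


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P[X >= k] for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


def verify_ownership(
    predict: Callable[[Hashable], int],
    triggers: Sequence[Hashable],
    secret_labels: Sequence[int],
    num_classes: int,
    alpha: float = 1e-3,
) -> Tuple[bool, float]:
    """Black-box check: only model outputs are used, never model parameters.

    Assumption (illustrative): a model without the watermark matches each
    secret label by chance with probability 1 / num_classes, independently.
    The claim is accepted when the observed number of matches would be very
    unlikely (p-value below alpha) under that assumption.
    """
    if num_classes < 2:
        raise ValueError("num_classes must be at least 2")
    hits = count_trigger_hits(predict, triggers, secret_labels)
    p_value = binomial_tail(len(triggers), hits, 1.0 / num_classes)
    return p_value < alpha, p_value
```

In use, the owner would keep `triggers` and `secret_labels` private, wrap the suspect model's prediction API as `predict`, and treat a very small p-value as evidence of the embedded watermark. Real schemes typically need a more careful null model than the uniform-chance assumption used here.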

Papers