Paper ID: 2310.19537
On consequences of finetuning on data with highly discriminative features
Wojciech Masarczyk, Tomasz Trzciński, Mateusz Ostaszewski
In the era of transfer learning, training neural networks from scratch is becoming obsolete. Transfer learning leverages prior knowledge for new tasks, conserving computational resources. While its advantages are well-documented, we uncover a notable drawback: networks fine-tuned on data with highly discriminative features tend to prioritize basic data patterns, forsaking valuable pre-learned features. We term this behavior "feature erosion" and analyze its impact on network performance and internal representations.
Submitted: Oct 30, 2023