Paper ID: 2310.19537

On consequences of finetuning on data with highly discriminative features

Wojciech Masarczyk, Tomasz Trzciński, Mateusz Ostaszewski

In the era of transfer learning, training neural networks from scratch is becoming obsolete. Transfer learning leverages prior knowledge for new tasks, conserving computational resources. While its advantages are well-documented, we uncover a notable drawback: when finetuned on data with highly discriminative features, networks tend to prioritize these basic data patterns, forsaking valuable pre-learned features. We term this behavior "feature erosion" and analyze its impact on network performance and internal representations.

Submitted: Oct 30, 2023