Characterizing catastrophic forgetting via the Neural Tangent Kernel
Neural networks have achieved near-optimal performance on supervised learning tasks. However, when facing a sequence of tasks whose data distribution changes over time, they tend to forget what was learned in the past, a phenomenon known as Catastrophic Forgetting (CF). This is one of the critical problems Continual Learning (CL) aims to solve. Although there is a large body of empirical work studying this pathology, few works have tackled it from a theoretical perspective. In this work, we provide a theoretical analysis of CF under the Neural Tangent Kernel (NTK) regime, where neural networks behave linearly.
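As a brief reminder of the regime referenced above (standard NTK background rather than a result specific to this work), the network is approximated by its first-order Taylor expansion around the initial parameters, which makes it linear in the parameters:

```latex
% Linearization of the network f(x; \theta) around the initialization \theta_0:
f(x; \theta) \;\approx\; f(x; \theta_0) + \nabla_\theta f(x; \theta_0)^\top (\theta - \theta_0)

% The associated (empirical) neural tangent kernel between inputs x and x':
k_{\mathrm{NTK}}(x, x') \;=\; \nabla_\theta f(x; \theta_0)^\top \, \nabla_\theta f(x'; \theta_0)
```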
A recent blog post by the lead author takes a deep dive into the roots of catastrophic forgetting for supervised learning. It can be found here.