Growing a Brain with Sparsity-Inducing Generation for Continual Learning

Hyundong Jin1, Gyeong-hyeon Kim1, Chanho Ahn2* and Eunwoo Kim1
1School of Computer Science and Engineering, Chung-Ang University
2Samsung Advanced Institute of Technology (SAIT)
*This work was done independently, without any support from SAIT.
{jude0316, leonardkkh, eunwoo}@cau.ac.kr, chanho.ahn@samsung.com
ICCV 2023

An overview of GrowBrain, which evolves old knowledge through sparsity-inducing parameter generation. For each task, a hypernetwork generates a task- and layer-conditioned sparse parameter set that transforms the previously learned weights, enabling adaptive reuse of past knowledge without interference.

Abstract

Deep neural networks suffer from catastrophic forgetting in continual learning, where they tend to lose information about previously learned tasks when optimizing for a new incoming task. Recent strategies isolate the important parameters for previous tasks to retain old knowledge while learning the new task. However, using the fixed old knowledge might act as an obstacle to capturing novel representations. To overcome this limitation, we propose a framework that evolves the previously allocated parameters by absorbing the knowledge of the new task. The approach operates with two different networks: the base network learns knowledge of sequential tasks, and the sparsity-inducing hypernetwork generates parameters at each time step for evolving old knowledge. The generated parameters transform the old parameters of the base network to reflect the new knowledge. We design the hypernetwork to generate sparse parameters conditioned on the task-specific information and the structural information of the base network. We evaluate the proposed approach in class-incremental and task-incremental learning scenarios for image classification and video action recognition tasks. Experimental results show that the proposed method consistently outperforms a large variety of continual learning approaches in those scenarios by evolving old knowledge.

Motivation:
In continual learning, parameter isolation methods prevent forgetting by freezing a disjoint parameter subset for each task. However, this strategy assumes that old parameters are always reusable as they are, which limits adaptability to new tasks. We argue that merely reusing fixed parameters from prior tasks is insufficient, especially when tasks exhibit distributional shifts. The core question we raise is: in parameter isolation-based continual learning, is it possible to transform the knowledge of old tasks so that it better fits newer tasks?

GrowBrain method overview

Proposed Method:
We propose GrowBrain, a novel continual learning framework that evolves old knowledge through sparsity-inducing parameter generation. Our method consists of a base network and a hypernetwork. For each new task, the hypernetwork receives a task token and layer embeddings and generates a sparse parameter set that transforms the previously learned weights of the base network. A sparsity loss, based on the loss difference with and without each generated parameter, drives GrowBrain to generate only the essential updates to the parameter space and avoid redundancy. This yields an evolved, task-adaptive version of previously learned knowledge. Additionally, a reconstruction loss encourages consistency of the generated parameters across tasks.
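To make the generation step concrete, below is a minimal PyTorch sketch of the two-network setup. All names here (SparseHyperNetwork, evolve_layer, parameter_importance, the sigmoid gate) are our own illustrative assumptions, not the authors' released code: the hypernetwork takes a task token and a layer embedding and emits a gated, mostly-zero update that transforms the frozen old weights.

import torch
import torch.nn as nn

# Hypothetical names throughout; a sketch of the idea, not the paper's implementation.

class SparseHyperNetwork(nn.Module):
    """Generates a task- and layer-conditioned sparse parameter set."""

    def __init__(self, token_dim, layer_dim, weight_numel, hidden=128):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(token_dim + layer_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 2 * weight_numel),  # emits [delta | gate logits]
        )

    def forward(self, task_token, layer_emb):
        out = self.body(torch.cat([task_token, layer_emb], dim=-1))
        delta, gate_logit = out.chunk(2, dim=-1)
        gate = torch.sigmoid(gate_logit)  # soft per-weight mask in (0, 1)
        return gate * delta               # most entries pushed toward zero

def evolve_layer(old_weight, generated):
    """Transform the frozen old weights with the generated sparse update."""
    return old_weight.detach() + generated.view_as(old_weight)

def parameter_importance(task_loss, old_weight, generated, idx):
    """Sparsity criterion sketch: task-loss difference with vs. without
    the idx-th generated parameter; low-importance entries can be dropped."""
    with torch.no_grad():
        full = task_loss(evolve_layer(old_weight, generated))
        ablated = generated.clone()
        ablated[idx] = 0.0
        without = task_loss(evolve_layer(old_weight, ablated))
    return (without - full).abs()

# Toy usage: evolve one 16x8 linear layer for a new task.
token_dim, layer_dim = 8, 4
old_w = torch.randn(16, 8)                         # frozen weights from earlier tasks
hyper = SparseHyperNetwork(token_dim, layer_dim, old_w.numel())
task_token = nn.Parameter(torch.randn(token_dim))  # learned embedding of the new task
layer_emb = torch.randn(layer_dim)                 # encodes layer identity / structure
generated = hyper(task_token, layer_emb)
new_w = evolve_layer(old_w, generated)             # evolved, task-adaptive weights

x = torch.randn(32, 8)                             # surrogate input batch
task_loss = lambda w: (x @ w.t()).pow(2).mean()    # stand-in for the real task loss
print(new_w.shape, parameter_importance(task_loss, old_w, generated, idx=0))

The sigmoid gate above is only a stand-in for the paper's selection mechanism: in GrowBrain the criterion is the loss difference with and without each generated parameter, as sketched in parameter_importance, and a reconstruction loss additionally keeps the generated parameters consistent across tasks.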

Experimental Results:
We evaluate GrowBrain on both image classification and video action recognition benchmarks, including ImageNet, CUBS, Stanford Cars, Flowers, WikiArt, Sketch, ActivityNet, and UCF-101. GrowBrain consistently outperforms strong continual learning baselines, achieving higher accuracy while keeping parameter growth compact. Ablation studies confirm that sparsity-inducing generation and task- and layer-conditioned parameter evolution are both essential for preventing forgetting and enabling efficient adaptation.

BibTeX


@inproceedings{jin2023growing,
  title={Growing a Brain with Sparsity-Inducing Generation for Continual Learning},
  author={Jin, Hyundong and Kim, Gyeong-hyeon and Ahn, Chanho and Kim, Eunwoo},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={18961--18970},
  year={2023}
}