Growing a Brain with Sparsity-Inducing Generation for Continual Learning

Hyundong Jin1, Gyeong-hyeon Kim1, Chanho Ahn2* and Eunwoo Kim1
1School of Computer Science and Engineering, Chung-Ang University
2Samsung Advanced Institute of Technology (SAIT)
*This work was done independently, without any support from SAIT.
{jude0316, leonardkkh, eunwoo}@cau.ac.kr, chanho.ahn@samsung.com
ICCV 2023

An overview of GrowBrain, which evolves old knowledge through sparsity-inducing parameter generation. For each task, a hypernetwork generates a task- and layer-conditioned sparse parameter set that transforms the previously learned weights, enabling adaptive reuse of past knowledge without interference.

Abstract

Deep neural networks suffer from catastrophic forgetting in continual learning, where they tend to lose information about previously learned tasks when optimizing for a new incoming task. Recent strategies isolate the important parameters for previous tasks to retain old knowledge while learning the new task. However, using the fixed old knowledge might act as an obstacle to capturing novel representations. To overcome this limitation, we propose a framework that evolves the previously allocated parameters by absorbing the knowledge of the new task. The approach operates with two different networks: the base network learns knowledge of sequential tasks, and the sparsity-inducing hypernetwork generates parameters at each time step for evolving old knowledge. The generated parameters transform the old parameters of the base network to reflect the new knowledge. We design the hypernetwork to generate sparse parameters conditioned on the task-specific information and the structural information of the base network. We evaluate the proposed approach in class-incremental and task-incremental learning scenarios for image classification and video action recognition tasks. Experimental results show that the proposed method consistently outperforms a large variety of continual learning approaches in those scenarios by evolving old knowledge.

Motivation:
In continual learning, parameter isolation methods prevent forgetting by freezing a disjoint parameter subset for each task. However, this strategy assumes that old parameters are always reusable as they are, which limits adaptability to new tasks. We argue that merely reusing fixed parameters from prior tasks is insufficient, especially when tasks exhibit distributional shifts. The core question we raise is: in parameter isolation-based continual learning, is it possible to transform the knowledge of old tasks so that it better fits newer tasks?

GrowBrain method overview

Proposed Method:
We propose GrowBrain, a novel continual learning framework that evolves old knowledge through sparsity-inducing parameter generation. Our method consists of a base network and a hypernetwork. For each new task, the hypernetwork receives a task token and layer embeddings and generates a sparse parameter set that transforms the previously learned weights of the base network. A sparsity loss, based on the loss difference with and without each generated parameter, drives GrowBrain to generate only the essential updates to the parameter space and avoid redundancy. This yields an evolved, task-adaptive version of previously learned knowledge. Additionally, a reconstruction loss encourages consistency of the generated parameters across tasks.
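To make the generation step concrete, below is a minimal PyTorch sketch of the two-network setup. All names here (SparseHyperNetwork, evolve_layer, parameter_importance, the sigmoid gate) are our own illustrative assumptions, not the authors' released code: the hypernetwork takes a task token and a layer embedding and emits a gated, mostly-zero update that transforms the frozen old weights.

import torch
import torch.nn as nn

# Hypothetical names throughout; a sketch of the idea, not the paper's implementation.

class SparseHyperNetwork(nn.Module):
    """Generates a task- and layer-conditioned sparse parameter set."""

    def __init__(self, token_dim, layer_dim, weight_numel, hidden=128):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(token_dim + layer_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 2 * weight_numel),  # emits [delta | gate logits]
        )

    def forward(self, task_token, layer_emb):
        out = self.body(torch.cat([task_token, layer_emb], dim=-1))
        delta, gate_logit = out.chunk(2, dim=-1)
        gate = torch.sigmoid(gate_logit)  # soft per-weight mask in (0, 1)
        return gate * delta               # most entries pushed toward zero

def evolve_layer(old_weight, generated):
    """Transform the frozen old weights with the generated sparse update."""
    return old_weight.detach() + generated.view_as(old_weight)

def parameter_importance(task_loss, old_weight, generated, idx):
    """Sparsity criterion sketch: task-loss difference with vs. without
    the idx-th generated parameter; low-importance entries can be dropped."""
    with torch.no_grad():
        full = task_loss(evolve_layer(old_weight, generated))
        ablated = generated.clone()
        ablated[idx] = 0.0
        without = task_loss(evolve_layer(old_weight, ablated))
    return (without - full).abs()

# Toy usage: evolve one 16x8 linear layer for a new task.
token_dim, layer_dim = 8, 4
old_w = torch.randn(16, 8)                         # frozen weights from earlier tasks
hyper = SparseHyperNetwork(token_dim, layer_dim, old_w.numel())
task_token = nn.Parameter(torch.randn(token_dim))  # learned embedding of the new task
layer_emb = torch.randn(layer_dim)                 # encodes layer identity / structure
generated = hyper(task_token, layer_emb)
new_w = evolve_layer(old_w, generated)             # evolved, task-adaptive weights

x = torch.randn(32, 8)                             # surrogate input batch
task_loss = lambda w: (x @ w.t()).pow(2).mean()    # stand-in for the real task loss
print(new_w.shape, parameter_importance(task_loss, old_w, generated, idx=0))

The sigmoid gate above is only a stand-in for the paper's selection mechanism: in GrowBrain the criterion is the loss difference with and without each generated parameter, as sketched in parameter_importance, and a reconstruction loss additionally keeps the generated parameters consistent across tasks.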

Experimental Results:
We evaluate GrowBrain on both image classification and video action recognition benchmarks, including ImageNet, CUBS, Stanford Cars, Flowers, WikiArt, Sketch, ActivityNet, and UCF-101. GrowBrain consistently outperforms strong continual learning baselines, achieving higher accuracy while keeping parameter growth compact. Ablation studies confirm that sparsity-inducing generation and task- and layer-conditioned parameter evolution are both essential for preventing forgetting and enabling efficient adaptation.

BibTeX


@inproceedings{jin2023growing,
  title={Growing a Brain with Sparsity-Inducing Generation for Continual Learning},
  author={Jin, Hyundong and Kim, Gyeong-hyeon and Ahn, Chanho and Kim, Eunwoo},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={18961--18970},
  year={2023}
}