Neurocoder: General-Purpose Computation Using Stored Neural Programs

Published in ICML (Spotlight), 2022

Artificial Neural Networks are functionally equivalent to special-purpose computers. Their inter-neuronal connection weights represent the learnt Neural Program that instructs the network on how to process the data. However, without a way to store Neural Programs, a network is restricted to a single program and overwrites what it has learnt when trained on new data. Here we design Neurocoder, a new class of general-purpose neural networks in which the neural network “codes” itself in a data-responsive way by composing relevant programs from a set of shareable, modular programs stored in external memory. In this design, a Neural Program is efficiently treated as data in memory. Integrating Neurocoder into current neural architectures, we demonstrate a new capacity to learn modular programs, reuse simple programs to build complex ones, handle pattern shifts, and remember old programs as new ones are learnt. We also show substantial performance improvements in object recognition, video game playing, and continual learning tasks.



Figure: Neurocoder.

(a) The Main Network uses a working program to compute the output for the input. Here only the final layer of the Main Network is adaptively loaded with the working program (1). Other layers use traditional Neural Programs as connection weights (fixed after training).

(b) The Program Controller’s composition network controls access to the Program Memory, emitting queries and interpolating gate control signals in response to the input (2). It then performs recurrent multi-head program attention to the Program Status (3), triggering attention weights to the Singular Programs (4). The attended Singular Programs form an active program using low-rank approximation (5). This active program is then used to derive the working program from a residual program produced by the Program Controller’s integration network (6).

(c) The Program Memory stores the representations (singular programs) required to reconstruct the active program to be used by the Program Controller. Access is controlled through the Program Status, which includes keys (k) and slot usage (m) that are updated during training and computation (7).
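The data flow in steps (1) to (6) can be illustrated with a small NumPy sketch. This is a simplified illustration under stated assumptions, not the paper's implementation: it uses a single attention head and a single attention step (no recurrence), a plain dot-product softmax over the keys, and an additive, gated combination of the active and residual programs. All names (`compose_working_program`, `W_query`, `W_residual`, `gate`) and all dimensions are hypothetical, and the slot usage update from step (7) is omitted.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def compose_working_program(h, keys, U, V, W_query, W_residual, gate):
    """Build a weight matrix for one layer from stored singular programs.

    h:          (d_in,)            input representation reaching the final layer
    keys:       (n_slots, d_key)   one key per memory slot (Program Status)
    U:          (d_out, n_slots)   left singular programs, one column per slot
    V:          (d_in, n_slots)    right singular programs, one column per slot
    W_query:    (d_key, d_in)      hypothetical query projection
    W_residual: (d_out*d_in, d_in) hypothetical integration network (linear here)
    gate:       scalar in [0, 1]   assumed mixing weight for the residual
    """
    # (2) The controller emits a query in response to the input.
    query = W_query @ h
    # (3)-(4) Attention over the keys yields one weight per singular program.
    attn = softmax(keys @ query)
    # (5) Low-rank active program: sum_i attn_i * u_i v_i^T.
    active = (U * attn) @ V.T
    # (6) Residual program, combined additively (an assumption of this sketch).
    residual = (W_residual @ h).reshape(active.shape)
    working = active + gate * residual
    return working, active, attn

# Toy dimensions, chosen only for illustration.
rng = np.random.default_rng(0)
d_in, d_out, d_key, n_slots = 8, 6, 4, 2
h = rng.normal(size=d_in)
keys = rng.normal(size=(n_slots, d_key))
U = rng.normal(size=(d_out, n_slots))
V = rng.normal(size=(d_in, n_slots))
W_query = rng.normal(size=(d_key, d_in))
W_residual = 0.01 * rng.normal(size=(d_out * d_in, d_in))

working, active, attn = compose_working_program(
    h, keys, U, V, W_query, W_residual, gate=0.5
)

# (1) The final layer is loaded with the working program and applied to h.
y = working @ h
```

Because the active program is a weighted sum of `n_slots` outer products, its rank is at most `n_slots`. The memory therefore only has to store the singular vectors rather than full weight matrices, which is what allows programs to be treated as data.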
