Date Submitted: October 1, 2024
Journal/Venue: ICLR 2025 (under review)
Satchel Grant, Noah D. Goodman, James L. McClelland
Abstract:
Symbolic programs, defined by discrete variables with
explicit rules and relations, often have the benefits
of interpretability, ease of communication, and
generalization. This contrasts with neural
systems, which consist of distributed representations with
rules and relations defined by learned parameters and
often have opaque inner mechanisms. Cognitive
scientists and computer scientists alike are interested
in finding unity between these two types of systems.
There is no guarantee, however, that the two are
reconcilable. To what degree do neural networks induce
abstract, mutable, slot-like variables in order to
achieve next-token prediction (NTP) goals? Can neural
functions be thought of as analogous to computer
programs? In this work, we train neural systems using
NTP on numeric cognitive tasks and then seek to
understand them at the level of symbolic programs. We
use a combination of causal interventions and visualization
methods in pursuit of this goal. We find that models
of sufficient dimensionality do indeed develop strong
analogs of symbolic algorithms purely from the NTP
objective. We then ask how variations in the tasks and
model architectures affect the models' learned solutions,
finding that numeric symbols do not form for every
variant of the task, and that transformers solve the
problem in a different fashion than their recurrent
counterparts. Lastly, we show that in all cases some
degree of gradience exists in the neural symbols,
highlighting the difficulty of finding simple,
interpretable symbolic stories of how neural networks
perform their tasks.
Taken together, our results are consistent with the
view that neural networks can approximate interpretable
symbolic programs of number cognition, but the particular
program they approximate and the extent to which they
approximate it can vary widely, depending on the network
architecture, training data, extent of training, and
network size.
This is the continuation of an ICLR Re-Align 2024 Workshop
paper linked below. This work provides a good demonstration
of what types of distributed representations can form
as a function of next-token prediction,
and how these representations can change depending on
architectural choices such as model size or attention vs.
recurrence. This is hopefully the first of many of my
projects that use causal analyses to address claims
about mechanism.
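To give a concrete sense of the causal analyses referenced
here, below is a minimal, hypothetical sketch of an
interchange intervention on a toy recurrent
next-token-prediction model. Every name, dimension, and
index in it (ToyModel, the patched dims, the step at which
we patch) is an illustrative assumption rather than the
paper's actual code; the idea is simply to overwrite a
candidate "variable" subspace of the hidden state with the
corresponding activations from a different run and check
whether the predictions shift the way a symbolic count
variable would.

# Minimal, hypothetical sketch of an interchange intervention on a toy
# recurrent next-token-prediction model. All names, sizes, and indices
# are illustrative assumptions, not the paper's code.
import torch
import torch.nn as nn

class ToyModel(nn.Module):
    def __init__(self, vocab_size=12, hidden_size=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.rnn = nn.GRU(hidden_size, hidden_size, batch_first=True)
        self.readout = nn.Linear(hidden_size, vocab_size)

    def forward(self, tokens, patch=None):
        # patch: optional (step, dims, source_hidden) triple that overwrites
        # part of the hidden state mid-sequence (the causal intervention).
        h = torch.zeros(1, tokens.shape[0], self.rnn.hidden_size)
        logits = []
        for t in range(tokens.shape[1]):
            _, h = self.rnn(self.embed(tokens[:, t:t + 1]), h)
            if patch is not None and t == patch[0]:
                _, dims, source_h = patch
                h[..., dims] = source_h[..., dims]  # swap in the candidate "variable"
            logits.append(self.readout(h[-1]))
        return torch.stack(logits, dim=1)

model = ToyModel()
base = torch.randint(0, 12, (1, 8))    # base counting sequence
source = torch.randint(0, 12, (1, 8))  # source sequence with a different count

# Cache the source run's hidden state just after step 4.
with torch.no_grad():
    h_src = torch.zeros(1, 1, 64)
    for t in range(5):
        _, h_src = model.rnn(model.embed(source[:, t:t + 1]), h_src)

# If dims 0..15 really encoded a count variable, patching them at step 4 would
# move the base run's subsequent predictions toward the source run's count.
patched_logits = model(base, patch=(4, slice(0, 16), h_src))

In practice, interventions like this are repeated over many
paired trials and candidate subspaces to test which parts
of the hidden state, if any, behave like mutable, slot-like
variables.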
Date Published: March 2, 2024
Journal/Venue: ICLR Re-Align Workshop and CogSci 2024
Satchel Grant, Zhengxuan Wu, James L. McClelland, Noah D. Goodman
Abstract:
The discrete entities of symbolic systems and their
explicit relations make such systems more transparent
and easier to communicate about. Neural systems, in
contrast, are often opaque. It is understandable, then,
that psychologists often pursue symbolic characterizations
of human cognition, and it is clear that the ability to
find symbolic variables within neural systems would be
beneficial for interpreting and controlling Artificial
Neural Networks (ANNs). Symbolic interpretations can,
however, oversimplify non-symbolic systems. This has been
demonstrated in research on children's performance on
tasks thought to depend on a concept of exact number,
where recent findings suggest a gradience of counting
ability in children's learning
trajectories. In this work, we take inspiration from
these findings to explore the emergence of symbolic
representations in ANNs. We demonstrate how to align recurrent
neural representations with high-level, symbolic
representations of number by causally intervening on the
neural system. We find that consistent, discrete representations
of numbers do emerge in ANNs. We use this to inform the
discussion on how neural systems represent quantity. The
symbol-like representations in the network, however, evolve
with learning, and can continue to vary after the neural
network consistently solves the task, demonstrating the graded
nature of symbolic variables in distributed systems.
This is an early iteration of ongoing work. The general
direction is interesting because Distributed Alignment
Search (DAS) effectively allows us to do precise cognitive
science directly on distributed networks. This is useful
not only for understanding how humans come to perform the
seemingly symbolic processing that they do, but also for
better understanding how artificial neural networks
perform abstract reasoning and generalization.
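For concreteness, here is a rough, hypothetical sketch of
the core DAS operation alluded to above: rotate the hidden
state with a learned orthogonal map, swap a small block of
rotated coordinates between a source run and a base run,
and rotate back. The variable names and dimensions are
assumptions for illustration, not the paper's
implementation.

# Hypothetical sketch of a DAS-style interchange intervention on RNN hidden
# states. Sizes and names are illustrative assumptions.
import torch
import torch.nn as nn
from torch.nn.utils.parametrizations import orthogonal

hidden_size, var_dims = 64, 8  # var_dims: width of the candidate "number" subspace

# Learned orthogonal change of basis over the hidden state.
rotation = orthogonal(nn.Linear(hidden_size, hidden_size, bias=False))

def interchange(h_base, h_source):
    """Swap the first `var_dims` rotated coordinates of the base state with the source's."""
    rb, rs = rotation(h_base), rotation(h_source)
    swapped = torch.cat([rs[..., :var_dims], rb[..., var_dims:]], dim=-1)
    return swapped @ rotation.weight  # undo the rotation (orthogonal inverse)

# During alignment, `rotation` is trained (with the task network frozen) so that
# the patched run produces the counterfactual output implied by swapping the
# hypothesized symbolic variable, e.g. the count from the source trial.
h_patched = interchange(torch.randn(1, hidden_size), torch.randn(1, hidden_size))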
Date Published: September 6, 2023
Journal/Venue: Neuron
Niru Maheswaranathan*, Lane T McIntosh*, Hidenori Tanaka*, Satchel Grant*,
David B Kastner, Joshua B Melander, Aran Nayebi, Luke E Brezovec,
Julia H Wang, Surya Ganguli, Stephen A Baccus
Abstract:
Understanding the circuit mechanisms of the visual code for
natural scenes is a central goal of sensory neuroscience. We show
that a three-layer network model predicts retinal natural scene
responses with an accuracy nearing experimental limits. The
model’s internal structure is interpretable, as interneurons
recorded separately and not modeled directly are highly
correlated with model interneurons. Models fitted only to
natural scenes reproduce a diverse set of phenomena related
to motion encoding, adaptation, and predictive coding,
establishing their ethological relevance to natural visual
computation. A new approach decomposes the computations of
model ganglion cells into the contributions of model
interneurons, allowing automatic generation of new hypotheses
for how interneurons with different spatiotemporal responses
are combined to generate retinal computations, including
predictive phenomena currently lacking an explanation.
Our results demonstrate a unified and general approach to
study the circuit mechanisms of ethological retinal
computations under natural visual scenes.
This was a big collaboration over the course of many years.
I love this work because it is a beautiful demonstration
of how to establish an isomorphism between biological and artificial
neural networks, and it shows how you can use that sort
of model for interpreting the real biological
neural code. I am a co-first author on this work for writing
most of the project code, developing many of the architectural
improvements, and building much of the interneuron comparison
analysis.
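For readers who want a concrete picture of the model
class, here is a minimal, hypothetical PyTorch sketch of a
three-layer CNN mapping short stimulus movies to ganglion
cell firing rates. The filter counts, kernel sizes, and
nonlinearities are illustrative assumptions, not the
published architecture.

# Hypothetical sketch of a three-layer CNN retinal model. All shapes and
# hyperparameters are illustrative, not the published configuration.
import torch
import torch.nn as nn

class RetinaCNN(nn.Module):
    def __init__(self, n_lags=40, n_cells=20):
        super().__init__()
        # Layer 1: spatiotemporal filters over a rolling window of stimulus frames.
        self.layer1 = nn.Sequential(nn.Conv2d(n_lags, 8, kernel_size=15), nn.Softplus())
        # Layer 2: "interneuron-like" units combining layer-1 outputs.
        self.layer2 = nn.Sequential(nn.Conv2d(8, 8, kernel_size=11), nn.Softplus())
        # Layer 3: fully connected readout to nonnegative ganglion cell firing rates.
        self.readout = nn.Sequential(nn.Flatten(), nn.LazyLinear(n_cells), nn.Softplus())

    def forward(self, stimulus):
        # stimulus: (batch, n_lags, height, width) clips of the stimulus movie
        a1 = self.layer1(stimulus)
        a2 = self.layer2(a1)  # hidden units later compared to recorded interneurons
        return self.readout(a2)

model = RetinaCNN()
rates = model(torch.randn(4, 40, 50, 50))  # predicted rates: 4 clips x 20 cells

The paper's interpretability claims rest on comparing
hidden-unit responses (a2 here) against separately recorded
interneurons, for example by correlating their responses to
held-out stimuli.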
Date Published: March 4, 2022
Journal/Venue: Asilomar Conference on Signals, Systems, and Computers
Xuehao Ding, Dongsoo Lee, Satchel Grant, Heike Stein, Lane McIntosh, Niru Maheswaranathan, Stephen Baccus
Abstract:
The visual system processes stimuli over a wide range of
spatiotemporal scales, with individual neurons receiving
input from tens of thousands of neurons whose dynamics
range from milliseconds to tens of seconds. This poses a
challenge for creating models that both accurately capture
visual computations and are mechanistically interpretable. Here we
present a model of salamander retinal ganglion cell spiking
responses recorded with a multielectrode array that captures
natural scene responses and slow adaptive dynamics. The model
consists of a three-layer convolutional neural network (CNN)
modified to include local recurrent synaptic dynamics taken
from a linear-nonlinear-kinetic (LNK) model. We presented
alternating natural scenes and uniform field white noise
stimuli designed to engage slow contrast adaptation. To overcome
difficulties fitting slow and fast dynamics together, we
first optimized all fast spatiotemporal parameters, then
separately optimized recurrent slow synaptic parameters. The
resulting full model reproduces a wide range of retinal
computations and is mechanistically interpretable, having
internal units that correspond to retinal interneurons with
biophysically modeled synapses. This model allows us to
study the contribution of model units to any retinal computation,
and examine how long-term adaptation changes the retinal neural
code for natural scenes through selective adaptation of
retinal pathways.
This project was a good extension of the CNN retinal model
listed above. In this work, we gave the CNN model
recurrence and used kinetic constants from an earlier
linear-nonlinear-kinetic (LNK) model to get it to exhibit
slow adaptation (something that was lacking from the
previous work).
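As a rough illustration of the two-stage fitting procedure
described in the abstract, the hypothetical sketch below
optimizes one parameter group at a time: first the fast
spatiotemporal parameters with the slow kinetic parameters
frozen, then the reverse. The helper names
(cnn_parameters, kinetic_parameters) are assumptions for
illustration, not the paper's code.

# Hypothetical sketch of two-stage fitting: fast spatiotemporal parameters
# first, slow recurrent/kinetic parameters second. Names are illustrative.
import torch

def fit_group(model, dataloader, params, n_epochs=10, lr=1e-3):
    """Optimize only the given parameter group; everything else stays frozen."""
    for p in model.parameters():
        p.requires_grad_(False)
    for p in params:
        p.requires_grad_(True)
    opt = torch.optim.Adam(params, lr=lr)
    loss_fn = torch.nn.PoissonNLLLoss(log_input=False)  # spike counts ~ Poisson
    for _ in range(n_epochs):
        for stim, spikes in dataloader:
            opt.zero_grad()
            loss = loss_fn(model(stim), spikes)
            loss.backward()
            opt.step()

# Stage 1: fit the fast CNN filters on the interleaved natural-scene / noise stimuli.
#   fit_group(model, dataloader, params=list(model.cnn_parameters()))
# Stage 2: fit the slow kinetic (LNK-style) synaptic parameters with the CNN frozen.
#   fit_group(model, dataloader, params=list(model.kinetic_parameters()))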