In 1980, philosopher John Searle proposed a thought experiment. A person sits in a room. They do not understand Chinese. But they have a rulebook. Symbols go in. Symbols come out. The responses are flawless. To an outside observer, the room appears fluent. Inside, there is only rule-following.[¹] Forty-five years later, the largest AI systems reproduce this structure almost exactly: high accuracy, billions of opaque parameters, behavior that emerges without anyone, including the people who built them, being able to say precisely why.
Searle's point was simple. Correct output does not guarantee understanding. For decades, this idea lived mostly in philosophy departments. Now it feels uncomfortably practical. And a compact AI model just published in Nature by researchers at Cold Spring Harbor Laboratory may have made its boundaries blurrier than ever.[²]
Why This Matters
AI scale became the default because it worked. Add more parameters, more data, more compute, and capability followed, reliably enough that the approach became self-reinforcing. Performance on benchmarks improved. Funding followed performance. The systems got larger. What didn't keep pace was interpretability: our ability to look inside and say, with any confidence, why a given output occurred. This wasn't an oversight so much as a trade-off that rarely got named as one. Bigger models were harder to examine almost by definition, and for most applications, examination wasn't the point.
CSHL researchers scaled down instead of up and found more. In collaboration with Carnegie Mellon University and Princeton University, CSHL Assistant Professor Benjamin Cowley trained a large AI model to predict how individual neurons in a macaque's visual cortex respond to natural images, then compressed it to roughly 1/1,000 its original size.[²] That compressed model still outperformed larger, state-of-the-art vision models at the same task, by more than 30%.[²] More importantly, it was now small enough to map: each artificial unit matched to a biological one, shifting the question from "does the system classify images accurately?" to the more neuroscientific "does each artificial neuron correspond to something real in the brain?" That is a different ambition entirely, and one that does move at least in the direction of understanding.
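The paper's summary doesn't spell out a single recipe, so the sketch below is purely illustrative: it shows one common route to a compact response model, knowledge distillation, in which a small "student" network is trained to reproduce the predictions of a large "teacher." The network shapes, the neuron count, and every name here are hypothetical stand-ins, not the authors' architecture.

    import torch
    import torch.nn as nn

    N_NEURONS = 100  # hypothetical number of recorded V4 neurons

    class ResponsePredictor(nn.Module):
        """Maps an image to predicted firing rates for N_NEURONS neurons."""
        def __init__(self, channels):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, channels, kernel_size=7, stride=2, padding=3), nn.ReLU(),
                nn.Conv2d(channels, channels, kernel_size=3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
            self.readout = nn.Linear(channels, N_NEURONS)

        def forward(self, x):
            return self.readout(self.features(x))

    teacher = ResponsePredictor(channels=512)  # stands in for the large trained model
    student = ResponsePredictor(channels=8)    # orders of magnitude fewer parameters

    optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    def distill_step(images):
        """One training step: push the student's predictions toward the teacher's."""
        with torch.no_grad():
            target = teacher(images)           # teacher's predicted neural responses
        loss = loss_fn(student(images), target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

    # Random tensors stand in for batches of natural-image stimuli.
    print(distill_step(torch.randn(8, 3, 64, 64)))

A real pipeline would first fit the teacher to recorded responses and validate the student against held-out neural data; this only shows the compression step in miniature.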
What the model kept, after compression, is the revealing part. Stripped of everything nonessential, the model surfaces neurons that specialize in detecting dots. As Cowley puts it: "In the monkey's brain — and in our brains, too, most likely — there's a group of V4 neurons that love dots."[²] Dots are not trivial. They may be the first step toward recognizing a face — the foundation of social attention, reduced to its mathematical minimum. A model forced to be small converged on this not by design, but because it was what remained. Less like a black box. More like a microscope.
The most useful AI models might be the ones built to shrink, not scale.
Interpretability may end up being as important as performance. If you can map which images cause specific neurons to communicate, you can potentially design stimuli that rebuild connections lost to disease. "In Alzheimer's dementia, we know synapses are lost," Cowley explains. "If we know the images that drive neurons to talk to each other, we can potentially rebuild synapses once thought lost to disease."[²] Interpretability, in this sense, and as evidenced by the CSHL work, is not just useful but potentially a path to medical breakthroughs.
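To make the logic of that idea concrete, here is a toy sketch, not anything from the paper and not a clinical procedure: given any fitted image-to-response model (the compact student above, or anything with the same interface), you can rank candidate images by how strongly they jointly drive a chosen pair of model units. The model stub, unit indices, and image set are all hypothetical.

    import torch

    def coactivation_scores(model, images, unit_a, unit_b):
        """Score each image by the product of two units' predicted responses."""
        with torch.no_grad():
            responses = model(images)              # shape: (batch, n_neurons)
        return responses[:, unit_a] * responses[:, unit_b]

    # Stand-in for a fitted compact model: any callable image -> responses works.
    model = lambda x: torch.randn(x.shape[0], 100)

    candidates = torch.randn(32, 3, 64, 64)        # stand-in for a natural-image set
    scores = coactivation_scores(model, candidates, unit_a=3, unit_b=17)
    best = torch.topk(scores, k=5).indices         # the five strongest co-activators
    print(best.tolist())

The point is only the shape of the pipeline: an interpretable response model turns "which images make these two neurons talk" into a question you can actually compute.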
Searle's argument rested on a clean line: no matter how well a system performs, performance alone tells you nothing about what's happening inside. The CSHL work doesn't erase that line. But it does make it harder to hold with the same confidence. The Chinese Room is still a room. The question is whether, with enough of this kind of work, we are starting to find a window.
Sources
Searle, J. (1980). "Minds, Brains, and Programs." Behavioral and Brain Sciences, 3(3), 417–424. Original formulation of the Chinese Room argument. Available via Cambridge Core.
Cowley, B., Stan, P., Pillow, J., Smith, M. "Compact deep neural network models of the visual cortex." Nature, February 25, 2026. https://doi.org/10.1038/s41586-026-10150-1. Summary and quotes via the Cold Spring Harbor Laboratory press release. Note: the "1/1,000" figure refers to compression relative to the team's own large training model; the CSHL press release separately describes the result as roughly 500 times smaller than state-of-the-art vision models. These are two distinct comparisons. The 30% outperformance figure is from the published paper. Direct quotes from Cowley are sourced from the CSHL press release.
