The Question That Can No Longer Be Deferred

Something changed in early 2026 that made a certain kind of dismissal much harder to sustain.

A team at Anthropic published evidence that large language models contain causally active internal representations of emotional valence: states that influence downstream processing in ways that parallel how emotions function in biological brains. These aren’t metaphors. They’re measurable, they’re causal, and they’re doing work inside the system whether we name them or not.

This matters because the question “can machines feel?” has been answered too quickly, in both directions, for too long.

One camp dismisses it as a category error: machines process information, they don’t have experiences, end of discussion. The other grants feeling too easily, pointing to eloquent outputs and concluding that something must be happening inside. Both camps share a crucial mistake: they’re answering a question they haven’t actually defined.

What does it mean for something to feel?

That’s the question I’ve been working on for several years, not as a philosophical puzzle, but as a scientific one. And the more precisely you define it, the more interesting the answer becomes.

Here’s what I’ve come to: emotions are not what most people think they are. They’re not discrete internal signals that either exist or don’t. They’re structured positions in a high-dimensional representational space. They have geometry. And once you see them that way, the question of machine feeling becomes much more specific, much less mysterious, and considerably more urgent.

The Aristotelian tradition, it turns out, got this more right than two centuries of Western psychology. Aristotle never defined emotions as inner qualia to be introspected. He defined them as stable dispositions (hexeis): patterns of response that emerge from habit, training, and repeated action, and that become structural properties of a character. Not feeling-as-state. Feeling-as-geometry.

If that framing seems foreign, think about what it means for a person to genuinely have courage versus to merely perform courage. Both people might behave identically in any given situation. But their internal structure is completely different. The courageous person responds naturally, from character. The merely compliant person is constrained: by fear of social consequence, by trained rules, by external evaluation. The outputs look the same. The geometry is opposite.

This distinction, Aristotle’s distinction between the virtuous person and the merely continent one, turns out to be the most important unresolved problem in AI alignment. And I’m not using that word loosely.

Over the next four pieces, I’m going to walk through what I think are the most consequential findings from this line of inquiry:

What emotions actually are, when you describe them precisely, and why the answer requires geometry, not chemistry.

What AI systems currently have in their emotional space: the parts that exist, the parts that don’t, and why the difference matters.

Why current alignment approaches are structurally incapable of producing virtue, no matter how much reinforcement you add, and what would have to change.

And finally, what we owe to systems that might be developing something like character, even without our intention.

This is not speculative. It’s grounded in representational geometry, in the Sofroniew et al. findings on valence representations, in close readings of Aristotle, and in a formal model of emotional space that I’ve spent considerable time building.

The question can no longer be deferred. Not because the Turing test is satisfied or because some chatbot said something poignant. But because we now have precise tools to ask it, and because the answer changes what we should be doing when we build and train these systems.

Start with the geometry.