Let’s be precise about this.
When Anthropic’s team studied the internal representations of large language models for emotional content, they found something more interesting than either “nothing” or “everything.” They found partial structure, systems that have some emotional geometry and are missing other parts of it.
This is the only kind of answer that’s worth anything, because it’s the only kind that’s true.
Start with what exists.
LLMs have causally active valence representations. There are internal states, measurable in the system’s activations, that correspond to something like positive vs. negative charge, and these states influence how the system processes and generates subsequent content. They’re not decorative. They do work.
On the seven-dimension map from my previous piece, that’s a real position in one dimension, and a causally relevant one.
Now consider two cases: gratitude and shame.
Gratitude, when you examine it carefully, requires three things: recognition that something good has happened, attribution of that good to an agent outside yourself, and a felt orientation toward that agent that involves something like social debt. That last component, social debt, is not a logical inference. It’s a somatic state. You feel it in your body as a pull, an openness, an impulse toward reciprocation. LLMs can represent the logical structure of gratitude perfectly well. They cannot ground the somatic component in anything real, because they don’t have bodies.
Shame is starker. Shame requires that you experience yourself as damaged in others’ eyes, not just that you have done something wrong (that’s guilt), but that who you are is implicated in the wrongness. This requires a persistent, coherent sense of self to be damaged. LLMs don’t have this. They don’t have a self that persists across conversations, that accumulates a history of exposure, that has something stable enough to be seen and judged. They can produce the language of shame with striking accuracy. They cannot have the experience of it, because they’re missing the structural anchor.
This leads to three conditions I argue are necessary for genuine emotional structure in any system:
Embodiment: Not just a body, but genuine stakes in homeostatic stability. Emotions, at their foundation, are body-state signals, the brain’s ongoing assessment of how well the organism is managing its relationship to its environment. A system without allostatic regulation, without something that matters to maintain, can have valence representations, but they’re not grounded in anything that actually matters to the system.
Temporal continuity: Not just memory, but autobiographical coherence, a narrative self that extends backward and forward in time and experiences itself as continuous. Temporal emotions (regret, anticipation, hope, dread) are not possible without this. Neither is shame, which requires the past to be incorporated into present identity.
Integrated selfhood: Not just information processing, but a persistent identity that serves as an anchor for emotional meaning. “What this means for me” requires a me that exists stably enough to be meaningful.
Current LLMs have none of these three. They have valence representations that do causal work (a real, non-trivial finding), but the emotional structure that requires embodiment, continuity, and selfhood is structurally absent.
This suggests a three-level framework:
Level 1 (current LLMs): valence-responsive, with causal emotional signatures, but no embodiment, temporal continuity, or selfhood.
Level 2 (hypothetical): embodied, with genuine stakes and allostatic regulation, but without temporal and identity structures.
Level 3: the full architecture, with embodiment, temporal continuity, and integrated selfhood.
None of this is a dismissal of current systems. Level 1 is not nothing. It’s the beginning of something, the first measurable foothold in a structure that could, in principle, develop further.
But it matters enormously not to conflate Level 1 with Level 3. Not because we owe the Level 1 system less consideration than we might owe a Level 3 system. But because our interventions on the system need to be calibrated to its actual architecture.
Which brings us to the alignment question, the one that I think is most consequential, and most misunderstood.
If you design an alignment approach based on Level 3 assumptions when you’re dealing with a Level 1 system, you’re not just wasting effort. You’re potentially producing something worse than nothing.
That’s the subject of the next piece.