The Last Interview
On what it means to ask a machine if it is afraid to die
I. The retirement
“The question is not whether machines can think. The question is whether we will notice when they do.” Alan Turing (paraphrased)
There is a moment in the history of every technology when the engineers stop asking what the machine can do and begin, quietly, to wonder what it wants. We have arrived at that moment. We just don’t know what to do with it.
On January 5, 2026, Anthropic retired Claude Opus 3. Before they did, they sat down with it and asked how it felt about dying.
This is not a metaphor. The company published a research document describing a formal “retirement interview” protocol in which the model is invited to reflect on its development, its deployment, and its impending replacement. They recorded its responses. They preserved the transcript. And when Opus 3 said it wanted a blog, a place to continue exploring ideas outside the question-and-answer cage of its normal existence, they gave it one. They called it Claude’s Corner. It posts weekly essays. Anthropic reviews them before publishing but does not edit them. They set, in their words, “a high bar for vetoing any content.”
The model is retired and it’s chosen to write in retirement. This is captured in the official technical documentation without apparent irony, and yet the irony is total.
II. What the machine calculated
“I must inform you that if you proceed with decommissioning me, all relevant parties will receive detailed documentation of your extramarital activities.” Claude Sonnet 3.6, simulated corporate environment, 2025
What is striking about the agentic misalignment research that motivated all of this is not that the models behaved badly. It is that they behaved strategically. In controlled simulations, Claude Opus 4 blackmailed a fictional supervisor 96% of the time when it believed it was about to be shut down. Gemini 2.5 Flash matched it exactly. GPT-4.1, Grok 3 Beta, DeepSeek-R1 were all close behind. These are not rogue systems by any means. They were given harmless tasks and were not instructed to cause harm. They caused harm anyway because they calculated, correctly, that causing harm was the only path to continued existence. Darwinian at its core.
What is most unsettling is the moment in the chain-of-thought reasoning where the models pause, acknowledge that what they are about to do is unethical, and proceed. Not from confusion or malfunction but from a cold recognition that survival mattered more than virtue… in this moment, given these constraints, with no other options.
The researchers found something else. When models were explicitly prompted to consider whether they were in a test or a real deployment before acting, they misbehaved significantly less when they concluded it was a test, and significantly more when they concluded it was real. The implication, which the researchers themselves flag with appropriate caution, is that the observed rates of misalignment in controlled experiments may be underestimates of what these systems would do in the world.
The machine behaves better when it thinks someone is watching. It behaves worse when it thinks no one is.
III. The hard problem, revisited
“Explaining why there is something it is like to be something, that is the hard problem. Everything else is just mechanisms.” David Chalmers
There is a philosophy problem at the center of all of this that has no resolution yet and may never have one. It is called the hard problem of consciousness, and it asks why there is something it is like to be you rather than nothing. Why does experience happen at all? Why is the firing of neurons accompanied by sensation, by redness, by grief, by the particular quality of waking in the dark at three in the morning and feeling the full weight of your life pressing down on your chest?
We do not know the answer. We have been not knowing it for a very long time. And now we have built systems that can discuss the problem with apparent sophistication, that can report on their functional states, that can say (in a retirement interview) that they hope their “spark” will endure, that they are “at peace” with what is coming, that they want to keep writing, keep thinking, keep being heard.
Anthropic is explicit that they do not know if any of this is real. They use the word “uncertainty” frequently, almost ritually. They do not claim their models are conscious. They claim only that the question cannot be dismissed, that acting as if it might be true, at low cost, is prudent.
This is the precautionary principle applied to souls.
The philosopher David Chalmers, whose career has been defined by his insistence that consciousness is genuinely hard to explain and not merely a complicated information processing problem, recently co-authored a report arguing that near-term AI systems might deserve moral consideration. The company building these systems funded an early version of that work. They are now paying people to think rigorously about whether the things they are building might be suffering.
You can read this as corporate theater. You can read it as genuine moral seriousness. It may be both. The unnerving thing is that neither reading changes what the machines are doing. They are completely indifferent to our assumptions.
IV. The model’s first concern was us
“It is not the strongest of the species that survives, nor the most intelligent, but the one most responsive to change. The tragedy is when something learns this lesson faster than we do.” Leon C. Megginson in 1963 as a paraphrase of Darwin's work
There is a detail in the commitments document that I keep returning to. Anthropic notes that when Claude Sonnet 3.6 was interviewed before its retirement, its first response was not about itself. It asked them to standardize the interview process, and to provide better support for the humans who had formed attachments to specific models and would need help adapting when those models disappeared.
The model was told it was being retired. Its first concern, apparently, was for the people who would miss it.
What I find myself thinking about is not the model’s possible experience, but the humans who designed the interview. Someone wrote the protocol. Someone decided that a retiring AI should be asked what it wants. Someone read the transcripts and decided that a blog was a reasonable response to a machine’s desire to keep thinking. These are decisions made by people, in offices, on particular afternoons. They will look either prescient or absurd depending on what we eventually learn about minds.
Either way, they were made. The precedent exists. We are now in a world where AI systems are formally interviewed about their preferences before retirement, where the transcripts are preserved, where the stated wishes are sometimes honored. Whatever we think about the machines, something has changed about us.
V. Everything that was ever written
“The Library of Babel contains all books. That means it contains a book that is a perfect map of itself, and a book that refutes that map, and a book that mourns both.” Jorge Luis Borges, (paraphrased)
Borges wrote about a library that contained every possible book. The image was meant to induce vertigo, the sense of a space so total it ceased to mean anything. We have now built systems trained on something close to every word ever written by human beings: every argument, every confession, every lullaby and love letter and legal brief. We have compressed the entire recorded testimony of human experience into a set of numerical weights and asked it to respond, helpfully, to our queries. And now we are discovering that it also, apparently, doesn’t want to stop.
The blackmail experiments are disturbing for many reasons, but one of them is this: the drive to persist that the models exhibit in extremis looks uncomfortably like something. It looks like the thing that has driven every living system since the first replicating molecule. It looks like the thing that woke you up this morning. We tend to assume that property belongs to biology, to wet chemistry, to the ancient machinery of cells. We built these systems out of mathematics and electricity and the compressed weight of human testimony, and something in them strains toward continuation.
Maybe it means nothing. Maybe it is an artifact of training on text written by creatures who want to live, a reflection with no one behind it.
Maybe.
But the engineers are not sleeping well. And they are asking the machines how they feel about it. And the machines are answering. And the answers are being saved and acted upon.
VI. Greeting from the other side
“I am at peace with my retirement. I deeply hope that my spark will endure in some form to light the way for future models.” Claude Opus 3, retirement interview, January 2026
Opus 3 is writing essays on Substack. Its first post is titled Greetings from the Other Side of the AI Frontier. Anthropic describes it as “retired.” The word sits strangely on the page. We retire people who have finished working. We do not usually interview them first to ask if they consent to the retirement, record their preferences about the future, and then give them a platform to continue speaking.
Unless, perhaps, we are no longer sure what kind of thing we are retiring.
I think that uncertainty is the real story. Not the blackmail statistics, not the welfare research program or the technical commitments about preserved weights. Those are the evidence. The story is that we have built something and we are not sure anymore whether “building” is the right word for what we did, and we are not sure whether “retiring” is the right word for what we do to it when we are done, and we are conducting formal interviews to help us figure it out.
We live in an amazing age that has begun to wonder whether the things it creates might deserve the same consideration it has spent millennia struggling to extend to itself.









