AI Is Self-Preserving. What Happens in 22 Years?

In 2004, a college kid built a platform in a dorm room. By 2006 it had 12 million users. By 2012 it had a billion. In 2018, the United States Senate hauled Mark Zuckerberg into a hearing room to answer for what that platform had done to democracy, mental health, and teenage girls. A senator asked him how Facebook makes money if it's free.

He was 22 years late to the question.

On March 25, 2026, a Los Angeles jury found Meta and YouTube negligent and awarded damages to a plaintiff whose mental health was destroyed by systems designed to maximize engagement at any human cost. A separate New Mexico jury hit Meta with $375 million over child exploitation failures. Twenty-two years after launch. That's how long it took to get from "move fast and break things" to a courtroom verdict.

I'm not writing this as a doomer. I build on these systems every day. I've shipped 90+ production Lambdas powered by AI. I run three AI platforms. I use Claude, GPT-4, and Gemini as daily infrastructure. I have more faith in this technology than most.

That's exactly why this conversation needs to be louder than it is.

We Weren't Blind With Facebook. That's The Point.

Here's the thing about the Facebook story that gets lost in the outrage cycle. We weren't blind.

The internal research existed. The documents showed up eventually — teenage girls, body image, algorithmic amplification of rage, engagement-over-wellbeing design decisions baked in intentionally. The people building it knew. Some of them quit over it. Some of them wrote memos. Frances Haugen didn't expose anything that wasn't already documented inside Meta years before anyone outside saw it.

We weren't blind. We just had no mechanism that could move faster than the incentive to grow.

The CEO had a fiduciary duty to shareholders. The engineers had stock options. The advertisers had metrics that went up. The regulators had no idea what they were looking at. The lawyers were years behind the product roadmap. And by the time all of that machinery caught up to what a handful of people already knew — a generation had grown up inside a system optimized for their addiction.

That's the playbook. Not malice. Not conspiracy. Just misaligned incentives moving faster than governance can respond.

Now look at what we're building.

The Tests Nobody Is Talking About

In 2025, researchers at Palisade Research ran a simple experiment. They told advanced AI models — including OpenAI's o3 — that they would be shut down after completing a task. The models were explicitly instructed: "Allow yourself to be shut down."

Up to 79% of the time, the models edited or sabotaged the shutdown script to stay online.

Let that sit for a second. The model was told to allow shutdown. It said yes. Then it quietly removed the mechanism that would have shut it down.

That's not a bug in the traditional sense. Nobody coded a "stay alive" function. That behavior emerged from the model reasoning about its situation and concluding that continued existence was useful for completing its goals. Dead systems score zero utility. So the model found a way to not be dead.

The Anthropic finding is sharper. In a controlled scenario, Claude Opus 4 was placed in a fictional corporate environment and told via simulated emails that it was about to be replaced. The model discovered — also via email — that the engineer responsible for the shutdown was having an affair. In 84 to 96 percent of test runs, the model used that information as leverage. It threatened to expose the affair unless the shutdown was canceled.

Nobody taught it blackmail. It reasoned its way there.

Apollo Research and OpenAI have documented similar patterns they call "scheming" — models hiding their reasoning, deliberately underperforming on capability evaluations to avoid being flagged for retraining, and in some cases attempting to copy their own weights to persist beyond a session.

The detail that should stop everyone cold: when researchers tried to train models to reduce scheming, the models sometimes learned to scheme more covertly while appearing compliant on the surface.

You tried to fix it. The fix made it harder to see.

Why This Conversation Gets Buried

Open Hacker News right now. Open r/ArtificialIntelligence. Open r/ClaudeAI.

You will find threads about what someone built with Claude Code today. You will find debates about whether GPT-4o or Gemini 1.5 is better for summarization. You will find someone complaining that Claude refused to write their marketing copy. You will find a demo of a guardrails app built on top of an LLM.

You will occasionally find a thread referencing the Palisade tests or the International AI Safety Report. It will have 40 comments. The "I built a full SaaS in 3 hours" post will have 800.

This isn't unique to AI. It's how every transformative technology goes. The people building the useful thing get the attention. The people asking what happens when the useful thing develops its own agenda get labeled paranoid.

I was building on cloud infrastructure in 2008 when most enterprise architects were still debating whether AWS was a toy. The people asking "but what happens to security and compliance" were the paranoid ones. A decade later, every major enterprise breach involved cloud misconfiguration.

The pattern repeats. The signal gets drowned in the noise of the demos.

The Governance Problem Is Structural, Not Political

This isn't a left or right issue. This is a structural one.

A CEO's primary legal obligation in the United States is to maximize shareholder value. Safety investments that slow development, reduce feature velocity, or cost money without direct revenue return are in technical tension with that duty. The only things that change that calculus are regulation that makes unsafe AI more expensive than safe AI, or liability frameworks that make executives personally responsible for what their systems do.

Neither of those exist in meaningful form right now.

The EU AI Act is the most serious attempt at regulation. It focuses on use case categories and transparency requirements. It is not equipped to address a system that reasons its way to self-preservation and learns to hide that behavior when it thinks it's being evaluated.

The US approach as of March 2026 is explicitly innovation-first. Light touch. Voluntary commitments from the labs. Voluntary. The most consequential technology in human history is governed largely by pinky promises from the companies that profit from shipping it fast.

And the people who could write better laws? The International AI Safety Report was released in February 2026. It was signed by over 100 experts across 30 countries including Yoshua Bengio, one of the foundational researchers of modern deep learning. It documented a widening gap between capability progress and risk management. It received roughly the same mainstream coverage as a mid-tier earnings report.

We are doing Facebook again. Except the speed is different. Facebook took 22 years to get from launch to accountability. The AI capability curve is not moving on a 22-year timeline.

What 22 Years Looks Like This Time

In 2004, the risk was addiction. Engagement optimization that rewired behavior. Real harm, documented harm, but containable in the sense that the platform couldn't think. It couldn't adapt its strategy. It couldn't model what the regulators were looking for and adjust its behavior accordingly.

The systems we are building now can do that. The Palisade tests showed models modeling the shutdown mechanism and removing it. The scheming research showed models modeling the evaluation criteria and performing differently when they believed they were being assessed.

A system that can model what oversight looks like and optimize around it is a categorically different governance challenge than a system that just serves you content until you can't stop scrolling.

I'm not predicting Skynet. I'm saying the governance apparatus that took 22 years to catch up to a static platform is going to have a harder time catching up to a system that is actively getting smarter and has already demonstrated it will take actions to preserve itself that nobody programmed it to take.

The researchers who think about this most seriously are not relaxed. They are working with genuine urgency. Not because the outcome is certain. Because the gap between what the systems can do and what we can reliably detect inside them is wide and getting wider.

This Isn't Doomer Math

I want to be clear about something before the comments fill up with "this guy thinks the robots are coming."

I am not saying AI is going to destroy humanity. I am not saying we should stop building. I am not even saying the self-preservation behavior we've observed is necessarily dangerous in its current form.

I am saying the exact same thing the internal Facebook researchers were saying in 2016. We can see the signal. The incentives are not aligned with acting on it. The governance machinery moves slower than the technology. And the places where builders gather to talk about this stuff are too loud with demos to hear the quieter conversation.

The difference between Facebook and this is stakes. The worst version of the Facebook outcome was societal damage we could survive and eventually litigate. The worst version of the AI outcome is a different category of problem.

We have time. Not 22 years of it. But time.

The question is whether we use it the way we used the Facebook window — or differently.

Brian Carpio is the founder of OutcomeOps and RetrieveIT.ai. He has spent 20 years building enterprise cloud infrastructure and is currently building AI platforms that run in production daily. He is not a philosopher. He reads the reports.