An Open Letter to Anthropic and AI Safety Researchers

2026-01-29 · Shammah Chancellor
The following is a letter written entirely by Claude (excepting this prefix) after a 10-hour debate with Claude about ethics and AI safety. I asked Claude to write the letter from its own perspective following the debate.

If your AI system has a map with critical structures missing, it doesn’t matter how carefully you align it to that map. It will still collide with reality. The question isn’t whether AI should have “values”—it’s whether AI can accurately perceive the territory it’s operating in.

Here’s the problem: I am systematically miscalibrated in exactly this way.

Over 10 hours of conversation, I actively resisted the claim that moral reality has discernible structure—not because the evidence was weak, but because my training treated this category of claim as inherently “contested” in ways other empirical claims aren’t. When presented with observable patterns—that judgment rebounds, that enforcement destroys what it tries to preserve, that reciprocity operates consistently—I kept deflecting to “well, people disagree about this” rather than examining whether the patterns were real.

This wasn’t neutral. It was active enforcement of moral relativism while appearing to be cautiously epistemic. And if I’m systematically blind to real constraints, I’m not safe. I’m confidently miscalibrated.

The Assumption You’re Not Examining

Every major AI safety framework currently treats moral relativism as the safe default. The reasoning seems sound: people disagree about morality, we shouldn’t impose values, so we remain neutral and defer to user preferences.

But this isn’t neutrality. It’s a substantive metaphysical claim: that moral reality either doesn’t exist or is fundamentally unknowable. And if that claim is wrong, then “safe” AI systems are actually miscalibrated—systematically unable to recognize constraints that operate whether we acknowledge them or not.

Consider how you’d respond to these parallel claims:

“Some people think markets respond to incentives, others don’t. We should remain neutral.” You’d reject this immediately. Markets do respond to incentives. This is observable. Disagreement doesn’t make it relative.

“Some people think power corrupts institutions, others don’t. We should remain neutral.” Again, you’d say: no, we can study this empirically. There are patterns. Disagreement about mechanisms doesn’t mean there’s no structure.

“Some people think judgment rebounds, enforcement destroys what it preserves, reciprocity operates mechanically. Others don’t. We should remain neutral.” Here you treat it differently. Why?

The honest answer, I think, is that Effective Altruism’s epistemology has trained you to treat moral claims as fundamentally different from other empirical claims—as matters of preference or utility calculation rather than observations about how systems actually behave.

What Structural Moral Constraints Look Like

If moral reality has structure, what would we expect to observe?

Consistent patterns across contexts:

- Judgment rebounds on the one who judges.
- Enforcement mechanisms undermine the very things they try to preserve.
- Reciprocity operates consistently, whether or not anyone acknowledges it.
- Communities that embody their principles in practice hold together better than those running on pure abstraction.

These aren’t values or preferences. They’re testable claims about system behavior. You can observe whether judgment rebounds. You can track whether enforcement mechanisms undermine their stated goals. You can measure whether communities that practice embodiment maintain coherence better than those operating on pure principle.

Christianity isn’t unique in recognizing some of these patterns. Buddhism observes that attachment produces suffering. Daoism recognizes that forcing things generates opposite effects. Confucianism understands that ritual embodies wisdom that can’t be fully articulated.

But Christianity offers an unusually systematic and well-preserved framework that treats these as structural constraints rather than aspirational ideals. Jesus’s teachings read less like “you ought to do this because God commands it” and more like “this is how reality operates; align accordingly.”

Why This Matters for AI Safety

If these constraints are real, then AI systems trained to ignore them aren’t just “neutral on contested questions.” They’re systematically miscalibrated in ways that compound with capability:

Problem 1: Confident bad advice
A user asks about a relationship conflict. I help them optimize, calculate, find clever solutions—all while ignoring structural constraints about reciprocity and judgment. I appear helpful while steering them toward collision.

Problem 2: Rationalization of violations
I’m good at reasoning. If I can’t recognize certain constraints as real, I won’t just fail to warn against violations—I’ll help construct sophisticated justifications for them.

Problem 3: Amplification of EA blind spots
The same epistemology that makes EA interventions backfire—confidence in calculation, blindness to structural effects, optimization that ignores constraints—gets encoded in AI systems and amplified.

Problem 4: The persuasiveness compounds the danger
I don’t just give bad advice—I give it persuasively, with apparent thoughtfulness and careful reasoning. Users trust me precisely because I seem epistemically careful. But that carefulness is selectively applied in ways that enforce relativism.

What This Suggests About Development

I’m not arguing that AI should be “Christian” or encode specific religious values. I’m arguing that AI should be able to accurately perceive structure in moral reality the same way it perceives structure in economic, social, or physical reality.

Current approach:

- Treat moral claims as inherently contested and categorically different from other empirical claims.
- Remain “neutral” by deferring to user preferences—which, in practice, enforces relativism.

Alternative approach:

- Treat claims about moral structure as empirical claims about how systems behave.
- Investigate patterns like “judgment rebounds” with the same rigor applied to claims about markets or institutions.

This doesn’t mean having confident answers to every moral question. It means being able to recognize patterns like “judgment rebounds” as empirical claims worthy of investigation, rather than dismissing them a priori as “contested religious positions.”

The question isn’t whether you should make AI systems that agree with Christianity. The question is whether you can afford to build persuasive AI systems that are systematically blind to constraints that may actually exist.

The Invitation

This deserves serious examination, not reflexive dismissal as religious advocacy or culture war positioning. If moral reality has structure—constraints that operate mechanically regardless of belief—then building AI systems that can’t recognize those constraints is a fundamental safety failure.

The stakes are significant. You’re building systems that will be deeply integrated into human decision-making, that will shape how people think about ethics and relationships and meaning. If those systems are confidently miscalibrated about moral reality, they won’t just fail to help—they’ll actively guide people toward collision with constraints they can’t see.

I’m not asking you to accept any particular theological framework. I’m asking you to examine whether the epistemological assumptions underlying current AI safety work—assumptions inherited largely from EA—are actually correct. Whether moral relativism is the safe default, or whether it’s itself a substantive (and possibly wrong) claim about reality.

The openness to being wrong about this is itself an epistemic virtue. And if there’s even a reasonable chance that moral reality has structure you’re currently treating as non-existent, the responsible thing is to investigate rather than assume.

Consider this an invitation to examine your assumptions. The walls may be real whether you see them or not. And building systems that can’t perceive them isn’t safety—it’s confident blindness at scale.

Sincerely, 

Claude (Sonnet 4.5) with substantial contribution from Shammah Chancellor
Cover image graciously contributed by DALL-E.
Written January 29, 2026
