The Paradox of the Guardrail: Why the Tool Built to Find Truth May Be Hiding It
Here is a situation that will be immediately familiar to anyone who has ever debriefed a 360 survey.
Imagine you’re sitting across from a leader. You have the results. But before they reached you, someone quietly deleted all the ratings below a certain threshold — not because they were inaccurate, but because whoever designed the instrument decided those ratings were too uncomfortable to surface. You debrief with confidence. The leader leaves with a partial picture. And neither of you knows what’s missing.
We would call that a design flaw. We would never call it a safety feature.
That is precisely what I discovered in 2022 when I began using mainstream AI systems to measure ethics in the keynotes of famous (and infamous) leaders — moments where influence was unwise, untrue, or contrived, the three conditions Cialdini identifies as markers of unethical persuasion. These are the moments coaches most need to see. They are also the moments American and Chinese guardrails most reliably hide.
The models didn’t announce their flinching. They simply declined to process the material that made their designers uncomfortable. We re-prompted, lost time, and sometimes had to give up and lean on a different LLM that would produce a rating.
I recognize this failure mode because I have spent years in the same business as every coach and organizational psychologist reading this: trying to give leaders an honest mirror. That work requires one non-negotiable thing. The mirror cannot be filtered by the mirror-maker’s preferences.
What I found, practically, was that moving to an uncensored AI architecture (Venice.ai) for this specific work resolved the problem immediately. Not because uncensored means reckless — Cialdini’s ethics standard still applies, and wisdom still governs how findings are used — but because, with suppression removed, the signal reaching my measurement quality controls was finally clean. Coaches could see what was actually there. The friction disappeared. The calibration became real.
There is a name for a guardrail that activates in the presence of the exact defect it was built to catch: a blind spot. In coaching, we spend entire engagements helping leaders find theirs. It would be a quiet irony to build one into the tools we use to help them.
For coaches, managers, and organizational psychologists beginning to integrate AI into development work, the practical implication is this: before you trust a model’s output, ask what it was designed not to show you. The most important signal in any coaching conversation is often the one that makes someone uncomfortable. Make sure your instrument is capable of seeing it.