A Deliberate Rep: Catching the Plausible-but-Wrong Code AI Loves to Write

The UofAi Team · 2 min read · June 24, 2026


AI writes code that looks right. That's exactly the problem. Plausible-but-wrong is the most expensive kind of bug, because it sails through review on vibes. Here's a deliberate rep that catches it — on a task simple enough to fit in this post and real enough to ship.

The task, and the bar

You need a function that de-duplicates a list of user records by email, keeping the most recent record for each. Before prompting, set the bar as the three things a reviewer (or a test suite) would check:

1. Correct on the edge cases: empty input, no duplicates, and crucially, it keeps the newest, not the first.
2. Readable by the next engineer with no comment archaeology.
3. No hidden surprises: no input mutation, no accidental O(n²).

The vending-machine version

Prompt: "Write a function to dedupe a list of records by email." You get clean, confident code:

function dedupeByEmail(records) {
  const seen = new Set();
  return records.filter((r) => {
    if (seen.has(r.email)) return false;
    seen.add(r.email);
    return true;
  });
}

It's tidy. It runs. It passes the obvious test. Ship it?

Score it against the rubric. Criterion 1: it keeps the first occurrence it sees, not the most recent. That is a fail, and a silent data bug. Criteria 2 and 3 are fine: the code is readable, filter leaves the input untouched, and the Set lookup keeps the whole pass linear. But that criterion-1 miss is the kind of thing that corrupts data quietly for months. The "it looks right" instinct would have shipped it.

The rep

Do it deliberately: write the rubric as tests first, before trusting any code.

// keeps the most recent by updatedAt
expect(dedupeByEmail([
  { email: "a@x.com", updatedAt: 1 },
  { email: "a@x.com", updatedAt: 5 },
])).toEqual([{ email: "a@x.com", updatedAt: 5 }]);

Run the AI's version against it. It fails — it returns the updatedAt: 1 record. The gap is now visible instead of latent.
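The rest of the rubric can be encoded the same way. Here is a minimal sketch in plain JavaScript with no test runner, as an alternative to the Jest-style expect call above. The helper name checkRubric, the firstWins wrapper, and the sample records are assumptions made for illustration, not part of the original exercise.

```javascript
// Hypothetical helper: runs the rubric against any dedupe implementation
// and returns the names of the checks that failed.
function checkRubric(dedupe) {
  const same = (a, b) => JSON.stringify(a) === JSON.stringify(b);
  const failures = [];

  // Edge case: empty input yields empty output.
  if (!same(dedupe([]), [])) failures.push("empty input");

  // Edge case: with no duplicates, nothing is dropped.
  const unique = [
    { email: "a@x.com", updatedAt: 1 },
    { email: "b@x.com", updatedAt: 2 },
  ];
  if (dedupe(unique).length !== 2) failures.push("no duplicates");

  // Keeps the newest record, even when it arrives last.
  const newestLast = [
    { email: "a@x.com", updatedAt: 1 },
    { email: "a@x.com", updatedAt: 5 },
  ];
  if (!same(dedupe(newestLast), [{ email: "a@x.com", updatedAt: 5 }])) {
    failures.push("keeps newest");
  }

  // No input mutation: the caller's array looks the same afterwards.
  const input = [
    { email: "a@x.com", updatedAt: 1 },
    { email: "b@x.com", updatedAt: 9 },
    { email: "a@x.com", updatedAt: 5 },
  ];
  const before = JSON.stringify(input);
  dedupe(input);
  if (JSON.stringify(input) !== before) failures.push("no input mutation");

  return failures;
}

// The first-wins version from earlier in the post, renamed for comparison.
function firstWins(records) {
  const seen = new Set();
  return records.filter((r) => {
    if (seen.has(r.email)) return false;
    seen.add(r.email);
    return true;
  });
}

console.log(checkRubric(firstWins)); // [ 'keeps newest' ]
```

A helper like this takes the implementation as an argument, so the same checks can be rerun unchanged against whatever version comes back from the next prompt.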

Close the gap

You can see why it failed: the prompt never mentioned recency, and the model likely defaulted to first-wins because that is the most common dedupe pattern in its training data. You direct the fix (sort by recency, then keep the first record per email) and verify against the tests:

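function dedupeByEmail(records) {
  const byEmail = new Map();
  // Copy before sorting (sort works in place), then walk newest first.
  for (const r of [...records].sort((a, b) => b.updatedAt - a.updatedAt)) {
    // The first record seen for an email is now its most recent one.
    if (!byEmail.has(r.email)) byEmail.set(r.email, r);
  }
  return [...byEmail.values()];
}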

Green. And note [...records]: sort() reorders an array in place, so copying first keeps the caller's list untouched. That is a mutation risk the first version never had to think about.
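Sorting is not the only way to get there. As an alternative sketch (not the version the post settles on), a single pass can track the newest record per email without sorting or copying. The function name dedupeByEmailSinglePass and the sample data are hypothetical, and the code assumes updatedAt is always a comparable number.

```javascript
// Hypothetical single-pass variant: no sort, no copy, input left untouched.
function dedupeByEmailSinglePass(records) {
  const newest = new Map();
  for (const r of records) {
    const current = newest.get(r.email);
    // Replace only when strictly newer, so ties keep the earlier record.
    if (current === undefined || r.updatedAt > current.updatedAt) {
      newest.set(r.email, r);
    }
  }
  return [...newest.values()];
}

const sample = [
  { email: "a@x.com", updatedAt: 1 },
  { email: "b@x.com", updatedAt: 3 },
  { email: "a@x.com", updatedAt: 5 },
];
console.log(dedupeByEmailSinglePass(sample).map((r) => `${r.email}:${r.updatedAt}`));
// [ 'a@x.com:5', 'b@x.com:3' ]
```

The trade-off is output order: this variant returns records in the order each email first appeared, while the sorted version returns them newest first. Whichever order the callers expect is worth pinning down in a test as well.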

What this rep proves

The model wrote both the buggy and the correct version. The thing that separated them was the rubric-as-tests and the judgment to encode "most recent" as an executable check before trusting the output. That's the difference between generating code and being accountable for it — and it's exactly what survives the next, smarter model.

"Vibe-checking" AI code is how plausible bugs reach production. A deliberate rep is how you stop them.
