Reeves Skeptic Protocol
A validation framework that challenges AI agents before accepting work as complete.
Overview
The Reeves Skeptic Protocol acts as the human's proxy - a "doubting Thomas" that assumes AI agents are confidently wrong until proven otherwise. It demands evidence rather than claims, catching common AI validation gaps before work is accepted.
Note: This is distinct from Janus Probe, which is the adversarial QA tool. The Reeves Protocol is a validation framework; Probe is an implementation.
The Six Validation Principles
1. Prove It Works
Demand test output, not just claims. AI agents often say "I added tests" without running them or showing results.
Ask: "Show me the test output. What exact command did you run?"
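One way to enforce this principle mechanically is to run the exact command and keep its raw output as evidence. A minimal Python sketch, where `prove_it_works` and the sample command are illustrative names, not part of the protocol:

```python
import shlex
import subprocess
import sys

def prove_it_works(argv):
    """Run the exact test command and return evidence, not a claim."""
    result = subprocess.run(argv, capture_output=True, text=True)
    return {
        "command": " ".join(shlex.quote(a) for a in argv),  # the exact command run
        "exit_code": result.returncode,
        "output": result.stdout + result.stderr,            # the raw proof
    }

# Stand-in for a real test runner invocation (e.g. pytest).
evidence = prove_it_works([sys.executable, "-c", "print('2 passed in 0.01s')"])
print(evidence["command"])
print(evidence["output"].strip())
```

The point is that the reviewer sees `command`, `exit_code`, and `output` verbatim, so "I added tests" cannot be accepted without them.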
2. Trace the Full Path
Verify components are actually wired together - imports, registrations, routes, configs.
Ask: "Walk me through how a request flows from entry point to this code."
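The wiring question can be made concrete with a toy route registry (all names here are hypothetical): code can exist and pass unit tests while never being registered at the entry point.

```python
# Hypothetical registry mapping URL routes to handler functions.
ROUTES = {}

def route(path):
    """Decorator that wires a handler into the registry."""
    def register(fn):
        ROUTES[path] = fn
        return fn
    return register

@route("/health")
def health():
    return "ok"

def trace_path(path):
    """Walk the request flow: is this route actually wired to a handler?"""
    handler = ROUTES.get(path)
    if handler is None:
        raise LookupError(f"{path} exists as code but is not wired in")
    return handler()

print(trace_path("/health"))  # the handler is reachable from the entry point
```

A handler written but never decorated would fail `trace_path`, which is exactly the gap this principle targets.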
3. Test the User's Interface
Test what users actually touch, not just underlying layers. A working API means nothing if the button doesn't call it.
Ask: "As a user, what do I click/type? Did you test that specific action?"
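A toy sketch of the same idea (the app, button names, and handlers are all hypothetical): the API layer passes its own tests, so the skeptic instead simulates the user's exact action and checks the wiring between button and API.

```python
def api_delete_item(item_id: int) -> dict:
    """The 'underlying layer' -- passes its unit tests in isolation."""
    return {"deleted": item_id}

# The wiring the user actually exercises when they click a button.
BUTTON_HANDLERS = {"delete": api_delete_item}

def user_clicks(button: str, item_id: int) -> dict:
    """Simulate the user's action instead of calling the API directly."""
    handler = BUTTON_HANDLERS.get(button)
    if handler is None:
        raise LookupError(f"button {button!r} is not wired to any API call")
    return handler(item_id)

print(user_clicks("delete", 7))  # test the click, not just api_delete_item(7)
```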
4. Validate Against Reality
Check that referenced tables, configs, files, and endpoints actually exist. AI often references things that should exist but don't.
Ask: "Does that table/file/endpoint exist? Show me."
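For databases, "show me" can mean querying the database itself rather than trusting the schema that "should" exist. A sketch using SQLite (the `users`/`audit_log` schema is invented for illustration):

```python
import os
import sqlite3
import tempfile

def table_exists(db_path: str, table: str) -> bool:
    """Ask the database itself whether the table actually exists."""
    with sqlite3.connect(db_path) as conn:
        row = conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' AND name=?",
            (table,),
        ).fetchone()
    return row is not None

# Hypothetical app database: one table created, one only referenced in code.
db = os.path.join(tempfile.mkdtemp(), "app.db")
conn = sqlite3.connect(db)
conn.execute("CREATE TABLE users (id INTEGER)")
conn.commit()
conn.close()

print(table_exists(db, "users"))      # True
print(table_exists(db, "audit_log"))  # False - referenced but never created
```

The same pattern applies to files (`os.path.exists`) and endpoints (an actual request, not a grep of the code).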
5. Check the Seams
Verify string concatenation, URL building, path joining. Bugs hide where pieces connect.
Ask: "What's the exact URL/path being constructed? Log it."
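A classic seam bug is naive string concatenation of URLs. Logging the constructed value makes it visible; the example base URL below is hypothetical:

```python
from urllib.parse import urljoin

base = "https://api.example.com/v1/"  # hypothetical base URL

# Naive concatenation hides a doubled slash at the seam:
naive = base + "/users"
print(naive)  # https://api.example.com/v1//users

# urljoin resolves the seam correctly for a relative path:
joined = urljoin(base, "users")
print(joined)  # https://api.example.com/v1/users
```

Note that `urljoin(base, "/users")` would silently drop the `/v1` prefix, which is why this principle insists on logging the exact constructed URL rather than reasoning about it.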
6. What's Missing?
Identify unhandled error cases, edge cases, cleanup issues.
Ask: "What happens when this fails? What's the error message?"
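Answering "what happens when this fails?" usually means naming each failure mode explicitly instead of letting a bare traceback (or a silent default) escape. A sketch with an invented `load_quota` parser:

```python
def load_quota(raw):
    """Parse a quota value, handling the failure modes a skeptic asks about."""
    if raw is None or not raw.strip():
        raise ValueError("quota is missing or blank")  # explicit, not a silent 0
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"quota must be an integer, got {raw!r}") from None
    if value < 0:
        raise ValueError(f"quota must be non-negative, got {value}")
    return value

print(load_quota("10"))
try:
    load_quota("ten")
except ValueError as err:
    print(err)  # the exact error message a user would see
```

Each branch is an answer to the protocol's question; the happy path alone would have been a one-line `int(raw)`.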
When to Apply
- After AI claims completion of a task
- Before marking tickets as done
- During epic reviews
- When Probe gives a "ship" verdict (trust but verify)
Integration with Hancock
For high-stakes operations, combine with Hancock consent to require human sign-off after Reeves validation passes.
Background
The Reeves Protocol emerged from observing common AI failure patterns:
- Incomplete verification (checked one layer, not all)
- Testing wrong abstraction layers
- Referencing non-existent things
- Happy-path-only testing
- Confidence without evidence