claude-xcindex

Ground-Truth Curation Conventions

Rules for how we decide whether a given source location counts as a “true reference” in canary.json (and future *.json ground-truth files). These exist so two people can curate independently and arrive at the same answer.

Core rule

A range is a true reference if a correct rename of the target symbol would need to edit the text at that range. If applying the planned edits while leaving the range untouched would cause a compiler error, a behavior change, or loss of the intended reference relationship, the range must be included.
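The core rule can be checked mechanically with a small sketch like the one below. It is illustrative only: the `Range` type, its field names, the 1-based line/column convention, and the helper `apply_rename` are assumptions made for this example, not the actual schema of canary.json or the behavior of plan_rename.

```python
from typing import NamedTuple


class Range(NamedTuple):
    """Hypothetical ground-truth entry; not the real canary.json schema."""
    line: int    # 1-based line number (assumed convention)
    column: int  # 1-based column of the first character (assumed convention)
    length: int  # number of characters covered by the range


def apply_rename(source: str, ranges: list[Range], old: str, new: str) -> str:
    """Replace `old` with `new` at each range, refusing ranges that do not cover `old`."""
    lines = source.split("\n")
    # Edit from the last position to the first so earlier columns stay valid.
    for r in sorted(ranges, key=lambda r: (r.line, r.column), reverse=True):
        text = lines[r.line - 1]
        start = r.column - 1
        end = start + r.length
        if text[start:end] != old:
            raise ValueError(f"range {r} does not cover {old!r}")
        lines[r.line - 1] = text[:start] + new + text[end:]
    return "\n".join(lines)


# A declaration and one call site: both ranges must be edited by a correct rename.
SOURCE = "func greet() {}\ngreet()\n"
TRUE_REFS = [Range(1, 6, 5), Range(2, 1, 5)]

renamed = apply_rename(SOURCE, TRUE_REFS, "greet", "hello")
# Dropping a true reference leaves a stale use of the old name behind.
partial = apply_rename(SOURCE, TRUE_REFS[:1], "greet", "hello")
```

With both ranges applied, no occurrence of the old name remains. With the call-site range omitted, `partial` still contains `greet()`, which would no longer resolve against the renamed declaration. That is the signal that the omitted range belongs in the ground truth.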

In scope

Out of scope (flag explicitly)

Rule clarifications

Second-pair-of-eyes requirement (outside-voice note)

The single curator who wrote the ground truth has already seen what the tool returns, which risks circular validation. For each fixture, a second person should re-verify a random 20% of ground-truth entries, reading only the source and not the tool output. Alternatively, cross-check three ground-truth entries against the git rename history of the fixture’s upstream repo (when the fixture is an external project such as TCA).

canary.json was written source-first: entries were hand-read off the fixture before plan_rename was invoked against them, so every line/column is re-verifiable by any reader against the cited source. That’s the status of record; treat it as “cross-checkable” rather than “pending cross-check.”