The 11% Citation Overlap Problem: Why Cross-Engine Averaging Fails
Key findings
- 11% domain overlap between ChatGPT and Perplexity for identical queries (ziptie.dev, 2026)
- Cross-engine composite scores have no statistical validity
- Engine-specific measurement is the only defensible approach
The measurement
We replicated the ziptie.dev cross-platform citation study using 1,200 commercial-intent queries probed simultaneously on ChatGPT and Perplexity.
For each query, we recorded all domains cited in the response. We then calculated the Jaccard similarity coefficient between the ChatGPT citation set and the Perplexity citation set.
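As a rough illustration of the per-query computation (not the actual pipeline), the sketch below computes the Jaccard coefficient for each query from the two engines' citation sets; the record structure and field names are hypothetical.

```python
from statistics import median

def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: |A ∩ B| / |A ∪ B|; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical records: one entry per query, listing the domains each engine cited.
results = [
    {"query": "best crm for small business",
     "chatgpt": {"hubspot.com", "zoho.com", "salesforce.com"},
     "perplexity": {"zoho.com", "pipedrive.com", "forbes.com"}},
    # ... one record per remaining query
]

similarities = [jaccard(r["chatgpt"], r["perplexity"]) for r in results]
print(f"median Jaccard similarity: {median(similarities):.2f}")
```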
Results
The median Jaccard similarity was 0.11 (11%). In other words, for a typical query, 89% of the combined set of domains cited across the two engines appears in only one engine's response.
This finding has a direct methodological consequence: any measurement system that averages citation rates across engines produces a composite score with no statistical meaning. You cannot meaningfully average across distributions with 89% divergence.
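To see why a composite hides more than it reveals, consider a purely illustrative example with made-up numbers: a brand with very different citation shares on each engine produces a blended score that describes neither engine's behaviour.

```python
# Hypothetical citation shares for one brand (fraction of queries where it was cited).
chatgpt_share = 0.30      # cited in 30% of ChatGPT responses
perplexity_share = 0.02   # cited in 2% of Perplexity responses

# A naive cross-engine "visibility score" averages the two...
composite = (chatgpt_share + perplexity_share) / 2   # 0.16

# ...but 16% matches neither engine: re-measuring either engine cannot reproduce it,
# and a change in the composite cannot be attributed to either engine's behaviour.
print(f"ChatGPT: {chatgpt_share:.0%}, Perplexity: {perplexity_share:.0%}, composite: {composite:.0%}")
```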
Implications for measurement
- Engine-specific metrics are mandatory. A brand's citation share on ChatGPT is a different measurement from its citation share on Perplexity. Combining them into one number destroys information.
- "AI visibility scores" are methodologically indefensible. Any product offering a single cross-engine score is producing a number that cannot be reproduced, cannot be decomposed, and cannot be attributed to any specific engine behaviour.
- Rectifies measures per-engine, always. Every metric we publish is engine-specific. Cross-engine comparisons are presented as separate measurements, never averaged.