Rectifies Methodology

v1.0 · Last updated 2026-05-12 · By Nathan Williams
Rectifies measures how often a brand or domain is cited by AI search engines in response to a defined set of prompts, using mixed-effects logistic regression with bootstrap confidence intervals.

What Rectifies measures

Rectifies measures citation share: the proportion of times a domain or brand is cited by an AI search engine in response to a defined prompt, relative to all domains cited for the same prompt class.

We do not measure "AI visibility scores," "brand sentiment," or "AI readiness." These terms describe products that produce a single number without statistical backing. Citation share is a proportion with a known denominator, a confidence interval, and a reproducible measurement procedure.

Panel composition

The Rectifies panel consists of prompts probed daily across five AI search engines:

  • ChatGPT (via OpenAI API, gpt-4o model with web browsing)
  • Perplexity (via Perplexity API, sonar model)
  • Claude (grounded mode via Anthropic API with tool-use web search)
  • Claude (baseline mode — no grounding, measuring training-data citation)
  • Google AI Mode (via Gemini API with Google Search grounding)

Each customer has 20–500 prompts, probed once daily per engine. The daily slim panel (design partners) covers 60 prompts × 5 engines = 300 observations per day.

Statistical methods

Primary metric: citation share

Citation share for domain d on engine e over period t is defined as:

CS(d, e, t) = (citations of d by e in t) / (total citations by e in t)
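The definition above can be sketched as a toy computation. The domains and counts here are illustrative, not real panel data:

```python
# Toy computation of citation share CS(d, e, t) from per-domain citation
# counts for one engine over one period. Counts are made up for illustration.
from collections import Counter

citations = Counter({"example.com": 37, "rival.io": 52, "other.net": 111})
total = sum(citations.values())                      # total citations by e in t
share = {domain: count / total for domain, count in citations.items()}
print(share["example.com"])                          # → 0.185
```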

Confidence intervals

All published metrics carry 95% bootstrap confidence intervals (BCa method, 10,000 resamples). We report the interval, not just the point estimate.
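A minimal sketch of the interval computation, using SciPy's `bootstrap` with the BCa method. The binary `cited` array is synthetic (one 0/1 outcome per probe), standing in for real panel observations:

```python
# Sketch: 95% BCa bootstrap CI for citation share, treating share as the
# mean of per-probe 0/1 citation outcomes. Data here is synthetic.
import numpy as np
from scipy.stats import bootstrap

rng = np.random.default_rng(42)
cited = rng.random(500) < 0.18           # 1 if the domain was cited in a probe

res = bootstrap(
    (cited.astype(float),),
    np.mean,                             # citation share = mean of 0/1 outcomes
    n_resamples=10_000,
    confidence_level=0.95,
    method="BCa",                        # bias-corrected and accelerated
    random_state=rng,
)
print(f"share={cited.mean():.3f}, "
      f"95% CI=({res.confidence_interval.low:.3f}, "
      f"{res.confidence_interval.high:.3f})")
```

Reporting the full interval, as here, makes clear how much of an apparent week-over-week change is sampling noise.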

Regression model

For attribution analysis, we fit a mixed-effects logistic regression:

logit(P(cited)) = β₀ + β₁·treatment + β₂·engine + β₃·week + (1|prompt_class) + (1|engine)

Random effects on prompt class and engine account for the non-independence of repeated measurements within prompt clusters and engine-specific citation behaviour.
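A fitting sketch using statsmodels' Bayesian mixed GLM, which approximates a frequentist mixed-effects logistic regression. The data is synthetic, the column names are assumptions, and for identifiability this simplified version puts engine in the fixed effects only, with a random intercept on prompt class:

```python
# Sketch of the attribution model: logistic regression with fixed effects
# for treatment, engine, and week, plus a random intercept per prompt class.
# Synthetic data; real panel observations and effect sizes will differ.
import numpy as np
import pandas as pd
from statsmodels.genmod.bayes_mixed_glm import BinomialBayesMixedGLM

rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "treatment": rng.integers(0, 2, n),
    "engine": rng.choice(["chatgpt", "perplexity", "claude"], n),
    "week": rng.integers(0, 8, n),
    "prompt_class": rng.choice([f"pc{i}" for i in range(10)], n),
})
logit = -1.0 + 0.8 * df["treatment"]                 # assumed true effect
df["cited"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

model = BinomialBayesMixedGLM.from_formula(
    "cited ~ treatment + C(engine) + week",          # fixed effects
    {"prompt_class": "0 + C(prompt_class)"},         # random intercepts
    df,
)
result = model.fit_vb()                              # variational Bayes fit
print(result.summary())
```

The random intercepts absorb the repeated-measurement structure, so the treatment coefficient is not inflated by correlated probes within a prompt class.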

Temporal stability and drift

ChatGPT rotates 74% of cited domains weekly (SISTRIX, 82,619 prompts, 17 weeks). A single probe is noise. Our minimum reporting cadence is 8 weeks — shorter measurement windows produce confidence intervals too wide to be actionable.
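A quick synthetic illustration of why short windows fail: sampling variability for a proportion shrinks roughly with the square root of the number of probes. The probe rate (60 probes/day on one engine) and true share are assumptions, not panel figures:

```python
# Illustration (synthetic): interval width for citation share vs. window
# length, simulating binomial sampling variability. Probe counts assumed.
import numpy as np

rng = np.random.default_rng(7)
true_share = 0.15
widths = {}
for weeks in (1, 4, 8):
    n = weeks * 7 * 60                   # hypothetical: 60 probes/day, 1 engine
    draws = rng.binomial(n, true_share, size=10_000) / n
    lo, hi = np.percentile(draws, [2.5, 97.5])
    widths[weeks] = hi - lo
    print(f"{weeks} weeks: n={n}, interval width = {widths[weeks]:.3f}")
```

The 1-week interval is wide enough to swallow most realistic week-over-week movements, which is why 8 weeks is the minimum reporting cadence.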

What we do not measure (and why)

We deliberately exclude:

  • "AI visibility scores" — composite numbers without defined denominators are not statistical measures
  • Brand sentiment — subjective annotation with inter-rater reliability below acceptable thresholds
  • "AI readiness" assessments — unfalsifiable product marketing
  • Single-run citation checks — week-over-week variance exceeds 40%; a single run is not measurement

Known limitations

  1. Engine API behaviour may diverge from consumer-facing product. We probe via API; users interact via chat UI. Grounding and citation behaviour may differ.
  2. Panel prompts are not exhaustive. We measure a sample, not the universe of possible queries.
  3. Citation ≠ recommendation. Being cited is not the same as being recommended. We measure presence, not endorsement.
  4. Temporal lag. Daily probing detects changes within 24 hours, but attribution analysis requires 8+ weeks of data for statistical power.

Versioning and changelog

This methodology is versioned. The current version is v1.0. All changes are logged at /changelog with:

  • What changed
  • Why it changed
  • Whether it affects historical comparability
  • Migration guidance for customers comparing pre- and post-change data

How to cite this work

Williams, N. (2026). Rectifies Methodology v1.0. rectifies.io. Available at: https://rectifies.io/methodology