Rectifies Methodology

v1.0 · Last updated 2026-05-12 · By Nathan Williams
Rectifies measures how often a brand or domain is cited by AI search engines in response to a defined set of prompts, using mixed-effects logistic regression with bootstrap confidence intervals.

What Rectifies measures

Rectifies measures citation share: the proportion of times a domain or brand is cited by an AI search engine in response to a defined prompt, relative to all domains cited for the same prompt class.

We do not measure "AI visibility scores," "brand sentiment," or "AI readiness." These terms describe products that produce a single number without statistical backing. Citation share is a proportion with a known denominator, a confidence interval, and a reproducible measurement procedure.

Panel composition

The Rectifies panel consists of prompts probed daily across five AI search engines:

  • ChatGPT (via OpenAI API, gpt-4o model with web browsing)
  • Perplexity (via Perplexity API, sonar model)
  • Claude (grounded mode via Anthropic API with tool-use web search)
  • Claude (baseline mode — no grounding, measuring training-data citation)
  • Google AI Mode (via Gemini API with Google Search grounding)

Each customer has 20–500 prompts, probed once daily per engine. The daily slim panel (design partners) covers 60 prompts × 5 engines = 300 observations per day.

Statistical methods

Primary metric: citation share

Citation share for domain d on engine e over period t is defined as:

CS(d, e, t) = (citations of d by e in t) / (total citations by e in t)
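The definition above can be sketched as a toy computation. The domains and counts here are illustrative, not real panel data:

```python
# Toy computation of citation share CS(d, e, t) from per-domain citation
# counts for one engine over one period. Counts are made up for illustration.
from collections import Counter

citations = Counter({"example.com": 37, "rival.io": 52, "other.net": 111})
total = sum(citations.values())                      # total citations by e in t
share = {domain: count / total for domain, count in citations.items()}
print(share["example.com"])                          # → 0.185
```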

Confidence intervals

All published metrics carry 95% bootstrap confidence intervals (BCa method, 10,000 resamples). We report the interval, not just the point estimate.
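A minimal sketch of the interval computation, using SciPy's `bootstrap` with the BCa method. The binary `cited` array is synthetic (one 0/1 outcome per probe), standing in for real panel observations:

```python
# Sketch: 95% BCa bootstrap CI for citation share, treating share as the
# mean of per-probe 0/1 citation outcomes. Data here is synthetic.
import numpy as np
from scipy.stats import bootstrap

rng = np.random.default_rng(42)
cited = rng.random(500) < 0.18           # 1 if the domain was cited in a probe

res = bootstrap(
    (cited.astype(float),),
    np.mean,                             # citation share = mean of 0/1 outcomes
    n_resamples=10_000,
    confidence_level=0.95,
    method="BCa",                        # bias-corrected and accelerated
    random_state=rng,
)
print(f"share={cited.mean():.3f}, "
      f"95% CI=({res.confidence_interval.low:.3f}, "
      f"{res.confidence_interval.high:.3f})")
```

Reporting the full interval, as here, makes clear how much of an apparent week-over-week change is sampling noise.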

Regression model

For attribution analysis, we fit a mixed-effects logistic regression:

logit(P(cited)) = β₀ + β₁·treatment + β₂·engine + β₃·week + (1|prompt_class) + (1|engine)

Random effects on prompt class and engine account for the non-independence of repeated measurements within prompt clusters and engine-specific citation behaviour.
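A fitting sketch using statsmodels' Bayesian mixed GLM, which approximates a frequentist mixed-effects logistic regression. The data is synthetic, the column names are assumptions, and for identifiability this simplified version puts engine in the fixed effects only, with a random intercept on prompt class:

```python
# Sketch of the attribution model: logistic regression with fixed effects
# for treatment, engine, and week, plus a random intercept per prompt class.
# Synthetic data; real panel observations and effect sizes will differ.
import numpy as np
import pandas as pd
from statsmodels.genmod.bayes_mixed_glm import BinomialBayesMixedGLM

rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "treatment": rng.integers(0, 2, n),
    "engine": rng.choice(["chatgpt", "perplexity", "claude"], n),
    "week": rng.integers(0, 8, n),
    "prompt_class": rng.choice([f"pc{i}" for i in range(10)], n),
})
logit = -1.0 + 0.8 * df["treatment"]                 # assumed true effect
df["cited"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

model = BinomialBayesMixedGLM.from_formula(
    "cited ~ treatment + C(engine) + week",          # fixed effects
    {"prompt_class": "0 + C(prompt_class)"},         # random intercepts
    df,
)
result = model.fit_vb()                              # variational Bayes fit
print(result.summary())
```

The random intercepts absorb the repeated-measurement structure, so the treatment coefficient is not inflated by correlated probes within a prompt class.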

Temporal stability and drift

ChatGPT rotates 74% of cited domains weekly (SISTRIX, 82,619 prompts, 17 weeks). A single probe is noise. Our minimum reporting cadence is 8 weeks — shorter measurement windows produce confidence intervals too wide to be actionable.
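A quick synthetic illustration of why short windows fail: sampling variability for a proportion shrinks roughly with the square root of the number of probes. The probe rate (60 probes/day on one engine) and true share are assumptions, not panel figures:

```python
# Illustration (synthetic): interval width for citation share vs. window
# length, simulating binomial sampling variability. Probe counts assumed.
import numpy as np

rng = np.random.default_rng(7)
true_share = 0.15
widths = {}
for weeks in (1, 4, 8):
    n = weeks * 7 * 60                   # hypothetical: 60 probes/day, 1 engine
    draws = rng.binomial(n, true_share, size=10_000) / n
    lo, hi = np.percentile(draws, [2.5, 97.5])
    widths[weeks] = hi - lo
    print(f"{weeks} weeks: n={n}, interval width = {widths[weeks]:.3f}")
```

The 1-week interval is wide enough to swallow most realistic week-over-week movements, which is why 8 weeks is the minimum reporting cadence.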

What we do not measure (and why)

We deliberately exclude:

  • "AI visibility scores" — composite numbers without defined denominators are not statistical measures
  • Brand sentiment — subjective annotation with inter-rater reliability below acceptable thresholds
  • "AI readiness" assessments — unfalsifiable product marketing
  • Single-run citation checks — week-over-week variance exceeds 40%; a single run is not measurement

Known limitations

  1. Engine API behaviour may diverge from consumer-facing product. We probe via API; users interact via chat UI. Grounding and citation behaviour may differ.
  2. Panel prompts are not exhaustive. We measure a sample, not the universe of possible queries.
  3. Citation ≠ recommendation. Being cited is not the same as being recommended. We measure presence, not endorsement.
  4. Temporal lag. Daily probing detects changes within 24 hours, but attribution analysis requires 8+ weeks of data for statistical power.

Versioning and changelog

This methodology is versioned. The current version is v1.0. All changes are logged at /changelog with:

  • What changed
  • Why it changed
  • Whether it affects historical comparability
  • Migration guidance for customers comparing pre- and post-change data

How to cite this work

Williams, N. (2026). Rectifies Methodology v1.0. rectifies.io. Available at: https://rectifies.io/methodology