The SEO Informatica AI Visibility Benchmark measures whether service-business websites are discoverable, understandable, citable, and routable across search and AI-answer environments.

The 2026.06 release is based on a semantically reviewed anonymized sample of 50 service-business websites collected on June 3, 2026 UTC / June 4, 2026 IST. The result is blunt: most sampled pages were technically accessible, but weak as citation-ready source assets.

This is not a market-wide census. It is a reviewed benchmark sample designed to expose the gap between crawlability and citation-worthiness.

Benchmark Summary

Field	Value
Version	2026.06
Status	Published 2026.06 benchmark release.
Sample size	50 reviewed service-business websites
Unique anonymized domains	50
Collection dates	June 3, 2026 UTC / June 4, 2026 IST
Public domains	Withheld; `domain_public=false` for all records
Model	DUCR: Discoverable, Understandable, Citable, Routable
Dataset	/ai-visibility-benchmark/dataset/
Methodology	/ai-visibility-benchmark/methodology/
Limitations	/ai-visibility-benchmark/limitations/

Executive Findings

All 50 audited primary service pages returned HTTP 200, and their robots.txt rules allowed Googlebot, Bingbot, OAI-SearchBot, Claude-SearchBot, and PerplexityBot.
The median DUCR score was 52.5/100. The average DUCR score was 53.36/100.
Citable-source readiness was the weakest layer by far, with a median citable score of 4/30.
No sampled page had a critical blocker after semantic review.
Only 10 of 50 pages showed methodology signals, 6 of 50 showed limitations, 5 of 50 offered dataset/download signals, and 3 of 50 used at least one table.

DUCR Score Summary

DUCR Layer	Max Points	Median	Average	Main Finding
Discoverable	25	19.0	19.54	Crawl and index access were mostly present.
Understandable	25	15.0	14.38	Entity and page-role clarity were uneven.
Citable	30	4.0	5.90	Source-quality signals were badly underbuilt.
Routable	20	13.0	13.54	CTAs existed, but proof-to-next-step routes were inconsistent.
Total	100	52.5	53.36	Most pages sat in the middle: accessible, understandable enough, but not strong sources.
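
The layer maxima add up to 100 and the four layer averages add up to the reported total average. The sketch below reproduces that arithmetic. It assumes a site's total is the plain sum of its four layer scores, which the table is consistent with but which the scoring reference should confirm. The function and variable names are illustrative and are not taken from the published rubric.

```python
# Layer maxima and averages copied from the DUCR Score Summary table.
LAYER_MAX = {"discoverable": 25, "understandable": 25, "citable": 30, "routable": 20}
LAYER_AVG = {"discoverable": 19.54, "understandable": 14.38, "citable": 5.90, "routable": 13.54}


def ducr_total(layer_scores):
    """Sum four layer scores after checking each one against its maximum.

    Assumption: the total is an unweighted sum of the layers.
    """
    if set(layer_scores) != set(LAYER_MAX):
        raise KeyError("expected exactly the four DUCR layers")
    for layer, score in layer_scores.items():
        if not 0 <= score <= LAYER_MAX[layer]:
            raise ValueError(f"{layer} score {score} is outside 0-{LAYER_MAX[layer]}")
    return sum(layer_scores.values())


# Sum of the layer averages, rounded to the table's precision.
total_avg = round(ducr_total(LAYER_AVG), 2)

# Each layer average as a percentage of that layer's maximum.
share = {layer: round(LAYER_AVG[layer] / LAYER_MAX[layer] * 100, 1) for layer in LAYER_MAX}

print(total_avg)  # 53.36
print(share)      # citable is the lowest share, at about 19.7% of its maximum
```

Averages add across layers, but medians do not have to: the layer medians sum to 51.0 while the median total is 52.5, and both figures can be correct at once.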

Score Distribution

DUCR Total Band	Site Count
0-39	2
40-49	14
50-59	22
60-69	11
70-79	1
80-100	0

Citable Score Band	Site Count
0-5 / 30	32
6-10 / 30	9
11-15 / 30	8
16-20 / 30	1
21-30 / 30	0
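
For readers who want to band their own scores the same way, here is a minimal sketch. It assumes each band's lower bound is inclusive, so a fractional score such as 39.5 would fall in the 0-39 band; the published rubric may treat edge cases differently. The helper name `band` is illustrative.

```python
from bisect import bisect_right

# Lower bounds of every band after the first, read from the two tables above.
TOTAL_EDGES = [40, 50, 60, 70, 80]
TOTAL_LABELS = ["0-39", "40-49", "50-59", "60-69", "70-79", "80-100"]
CITABLE_EDGES = [6, 11, 16, 21]
CITABLE_LABELS = ["0-5", "6-10", "11-15", "16-20", "21-30"]

# Site counts per band, copied from the tables above.
TOTAL_COUNTS = dict(zip(TOTAL_LABELS, [2, 14, 22, 11, 1, 0]))
CITABLE_COUNTS = dict(zip(CITABLE_LABELS, [32, 9, 8, 1, 0]))


def band(score, edges, labels):
    """Return the label of the band containing score (lower bounds inclusive)."""
    return labels[bisect_right(edges, score)]


print(band(52.5, TOTAL_EDGES, TOTAL_LABELS))   # 50-59
print(band(4, CITABLE_EDGES, CITABLE_LABELS))  # 0-5

# Number of sites scoring below 60 in total, out of 50.
below_60 = sum(TOTAL_COUNTS[label] for label in ["0-39", "40-49", "50-59"])
print(below_60)  # 38
```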

What This Means

The sample does not show an access crisis. It shows a source-worthiness crisis.

A page can be crawlable, indexable, and technically available to AI-search crawlers while still being a weak citation candidate. In this sample, the common gap was not robots.txt. It was thin evidence: methodology, limitations, and downloadable source material were missing on most pages, author and review signals were absent, and self-contained answer blocks were too few.

That matters because AI answer systems and search surfaces need more than a sales page. They need a page that can be parsed, attributed, checked, summarized, and routed.

Crawler Access Matrix

Crawler/User Agent	Allowed Count	Allowed Percent	Why It Is Tracked
Googlebot	50/50	100%	Google Search crawling and indexing eligibility.
Bingbot	50/50	100%	Bing and Microsoft search discovery.
OAI-SearchBot	50/50	100%	ChatGPT Search surfacing and citation eligibility.
GPTBot	49/50	98%	OpenAI training crawler; tracked separately from search.
ChatGPT-User	49/50	98%	User-directed ChatGPT page fetches.
ClaudeBot	50/50	100%	Anthropic training crawler.
Claude-SearchBot	50/50	100%	Claude search retrieval.
Claude-User	50/50	100%	User-directed Claude fetches.
PerplexityBot	50/50	100%	Perplexity search indexing and surfacing.
Perplexity-User	50/50	100%	User-directed Perplexity fetches.

Full access study: /ai-visibility-benchmark/ai-crawler-access-study/
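
A per-site row of this matrix can be approximated with Python's standard-library robots.txt parser. This is a sketch, not the benchmark's actual tooling: the sample robots.txt and the example.com URL are made up for illustration, and the standard-library parser applies its own matching rules, which may differ from how each crawler interprets the same file.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens tracked in the Crawler Access Matrix.
AGENTS = [
    "Googlebot", "Bingbot", "OAI-SearchBot", "GPTBot", "ChatGPT-User",
    "ClaudeBot", "Claude-SearchBot", "Claude-User",
    "PerplexityBot", "Perplexity-User",
]

# Hypothetical robots.txt: blocks the training crawler, allows everything else.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def access_matrix(robots_txt, url, agents=AGENTS):
    """Map each user-agent token to True if robots_txt lets it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


matrix = access_matrix(SAMPLE_ROBOTS, "https://example.com/services/")
for agent, allowed in matrix.items():
    print(f"{agent:18} {'allowed' if allowed else 'blocked'}")
```

With this sample file, GPTBot is blocked while OAI-SearchBot stays allowed, which is why the matrix tracks training and search crawlers separately. A robots.txt check only covers declared rules; it says nothing about meta robots tags, HTTP status, or firewall-level blocking.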

Weakest Source Signals

Signal	Count	Percent
Methodology present	10/50	20%
Limitations present	6/50	12%
Dataset/download present	5/50	10%
Table present	3/50	6%
Original data present	0/50	0%
Author name present	0/50	0%
Visible date modified	0/50	0%

These are the signals that turn a page from generic service copy into a usable source.
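
As a rough self-check, the seven signals can be treated as a checklist. The sketch below lists which ones a page is missing. The field names and the example page are hypothetical and are not taken from the benchmark codebook.

```python
# The seven source signals from the table above, as hypothetical field names.
SOURCE_SIGNALS = [
    "methodology",
    "limitations",
    "dataset_download",
    "table",
    "original_data",
    "author_name",
    "visible_date_modified",
]


def missing_signals(page):
    """Return the source signals that a page record does not show."""
    return [signal for signal in SOURCE_SIGNALS if not page.get(signal, False)]


# Illustrative page record, not a row from the dataset.
example_page = {"methodology": True, "table": True}

missing = missing_signals(example_page)
print(f"{len(missing)} of {len(SOURCE_SIGNALS)} signals missing: {missing}")
```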

Vertical Mix

Vertical	Count	Median DUCR
Consulting	16	49.0
Accounting	10	51.5
Agency	9	57.0
Home services	5	62.0
Legal	3	61.0
Dental	2	55.5
Pest control	1	64.0
Junk removal	1	57.0
Roofing	1	56.0
Wellness	1	50.0
Med spa	1	49.0

Vertical comparisons with fewer than three records should be treated as descriptive only. They are not reliable category-level findings.
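
Applied to the table above, the three-record rule separates the verticals as follows. The counts and medians are copied from the table; the variable names are illustrative.

```python
# (record count, median DUCR) per vertical, copied from the Vertical Mix table.
VERTICALS = {
    "Consulting": (16, 49.0),
    "Accounting": (10, 51.5),
    "Agency": (9, 57.0),
    "Home services": (5, 62.0),
    "Legal": (3, 61.0),
    "Dental": (2, 55.5),
    "Pest control": (1, 64.0),
    "Junk removal": (1, 57.0),
    "Roofing": (1, 56.0),
    "Wellness": (1, 50.0),
    "Med spa": (1, 49.0),
}

MIN_RECORDS = 3  # below this, a vertical is descriptive only

comparable = [name for name, (count, _) in VERTICALS.items() if count >= MIN_RECORDS]
descriptive_only = [name for name, (count, _) in VERTICALS.items() if count < MIN_RECORDS]

print(comparable)        # five verticals, Consulting through Legal
print(descriptive_only)  # six verticals, Dental through Med spa
```

Even the verticals that clear the threshold have small samples, so their medians are best read as directional.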

What This Benchmark Measures

DUCR Layer	What It Measures
Discoverable	Crawl access, index/snippet eligibility, sitemap and internal-link access, canonical clarity, and visible HTML.
Understandable	Entity clarity, page-role clarity, semantic heading structure, schema alignment, and audience/service context.
Citable	Original evidence, methodology, official source support, answer blocks, tables, downloads, author/date/version, and limitations.
Routable	Whether source pages route readers and crawlers to the right service, proof, contact, and tracking paths.

Full scoring reference: /ai-visibility-benchmark/ducr-score/

What This Benchmark Does Not Measure

This benchmark cannot prove stable "LLM rankings." It cannot guarantee Google AI Overview, ChatGPT, Claude, Perplexity, or Copilot citations. It does not estimate AI traffic without analytics evidence. It does not claim that one prompt test represents platform-wide visibility.

Full limitation notes: /ai-visibility-benchmark/limitations/

Dataset Downloads

Reviewed CSV: /downloads/ai-visibility-benchmark-2026-06-reviewed.csv
Reviewed JSON: /downloads/ai-visibility-benchmark-2026-06-reviewed.json
Summary JSON: /downloads/ai-visibility-benchmark-summary-2026-06.json
Benchmark stats JSON: /downloads/benchmark-stats-2026-06.json
Codebook: /downloads/ai-visibility-benchmark-codebook-2026-06.md
DUCR rubric JSON: /downloads/ducr-scoring-rubric-2026-06.json
AI crawler access checklist: /downloads/ai-crawler-access-checklist-2026-06.csv
Monthly citation log template: /downloads/monthly-ai-citation-log-template-2026-06.csv

Dataset page: /ai-visibility-benchmark/dataset/
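
Readers who download the reviewed CSV can recompute the headline figures themselves. The sketch below shows one way to do that with the standard library. The column names (`site_id`, `ducr_total`) and the four sample rows are assumptions made for illustration; consult the codebook for the real schema. Only `domain_public` is a field name mentioned on this page.

```python
import csv
import io
import statistics

# Illustrative rows only. These are NOT records from the published dataset,
# and the column names other than domain_public are hypothetical.
SAMPLE_CSV = """\
site_id,domain_public,ducr_total
site-001,false,48
site-002,false,52
site-003,false,53
site-004,false,61
"""


def summarize(csv_text, column="ducr_total"):
    """Return the record count, median, and average for one numeric column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    scores = [float(row[column]) for row in rows]
    return {
        "n": len(scores),
        "median": statistics.median(scores),
        "average": round(statistics.mean(scores), 2),
    }


summary = summarize(SAMPLE_CSV)
print(summary)  # {'n': 4, 'median': 52.5, 'average': 53.5}
```

To run it against the real file, read the downloaded CSV into a string and pass the score column name given in the codebook.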

Source Evidence Matrix

Source Type	How It Is Used
Official platform guidance	Used for crawler, indexing, snippet, structured-data, and AI-search measurement requirements.
Benchmark dataset	Used for every public statistic and score distribution.
Semantic review notes	Used to exclude dirty records and label reviewed rows.
SEO Informatica inference	Clearly labeled when recommendations are inferred from official guidance and observed data.

Reference sources include OpenAI crawler documentation, Google Search Central robots and robots meta documentation, Anthropic crawler documentation, and Perplexity robots.txt documentation.

Version History

Version	Date	Notes
2026.06	June 4, 2026 IST	Initial reviewed 50-site benchmark package, prepared from the semantically reviewed dataset.

Next Step

With the benchmark evidence laid out, the natural commercial next step is a DUCR baseline review of a service-business website.

Get a DUCR baseline for your service-business website

AI Visibility Benchmark for Service-Business Websites: DUCR Score, Dataset, and Methodology