The SEO Informatica AI Visibility Benchmark measures whether service-business websites are discoverable, understandable, citable, and routable across search and AI-answer environments.
The 2026.06 release is based on a semantically reviewed anonymized sample of 50 service-business websites collected on June 3, 2026 UTC / June 4, 2026 IST. The result is blunt: most sampled pages were technically accessible, but weak as citation-ready source assets.
This is not a market-wide census. It is a reviewed benchmark sample designed to expose the gap between crawlability and citation-worthiness.
Benchmark Summary
| Field | Value |
|---|---|
| Version | 2026.06 |
| Status | Published 2026.06 benchmark release. |
| Sample size | 50 reviewed service-business websites |
| Unique anonymized domains | 50 |
| Collection dates | June 3, 2026 UTC / June 4, 2026 IST |
| Public domains | Withheld; domain_public=false for all records |
| Model | DUCR: Discoverable, Understandable, Citable, Routable |
| Dataset | /ai-visibility-benchmark/dataset/ |
| Methodology | /ai-visibility-benchmark/methodology/ |
| Limitations | /ai-visibility-benchmark/limitations/ |
Executive Findings
- All 50 audited primary service pages returned HTTP 200 and allowed Googlebot, Bingbot, OAI-SearchBot, Claude-SearchBot, and PerplexityBot in robots checks.
- The median DUCR score was 52.5/100. The average DUCR score was 53.36/100.
- Citable-source readiness was the weakest layer by far, with a median citable score of 4/30.
- No sampled page had a critical blocker after semantic review.
- Only 10 of 50 pages showed methodology signals, 6 of 50 showed limitations, 5 of 50 offered dataset/download signals, and 3 of 50 used at least one table.
DUCR Score Summary
| DUCR Layer | Max Points | Median | Average | Main Finding |
|---|---|---|---|---|
| Discoverable | 25 | 19.0 | 19.54 | Crawl and index access were mostly present. |
| Understandable | 25 | 15.0 | 14.38 | Entity and page-role clarity were uneven. |
| Citable | 30 | 4.0 | 5.90 | Source-quality signals were badly underbuilt. |
| Routable | 20 | 13.0 | 13.54 | CTAs existed, but proof-to-next-step routes were inconsistent. |
| Total | 100 | 52.5 | 53.36 | Most pages sat in the middle: accessible, understandable enough, but not strong sources. |
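The caps and averages in this table can be sanity-checked with a few lines of arithmetic. The sketch below assumes the total is the plain sum of the four layer scores, which the point caps and averages suggest but the table does not state outright. The names `LAYER_MAX`, `LAYER_AVERAGE`, and `ducr_total` are illustrative and not part of the published rubric.

```python
# Layer caps and averages copied from the DUCR Score Summary table.
LAYER_MAX = {"discoverable": 25, "understandable": 25, "citable": 30, "routable": 20}
LAYER_AVERAGE = {
    "discoverable": 19.54,
    "understandable": 14.38,
    "citable": 5.90,
    "routable": 13.54,
}


def ducr_total(layer_scores: dict[str, float]) -> float:
    """Sum the four layer scores after checking each against its cap.

    Assumption: the total is a simple sum of layers (hypothetical helper).
    """
    if set(layer_scores) != set(LAYER_MAX):
        raise ValueError("expected exactly the four DUCR layers")
    for layer, score in layer_scores.items():
        if not 0 <= score <= LAYER_MAX[layer]:
            raise ValueError(f"{layer} score {score} is outside 0-{LAYER_MAX[layer]}")
    return sum(layer_scores.values())


# The four caps add up to the 100-point scale.
assert sum(LAYER_MAX.values()) == 100

# Averages are additive, so the layer averages sum to the total average.
print(round(ducr_total(LAYER_AVERAGE), 2))  # 53.36
```

Medians do not add the same way. The layer medians sum to 51.0 while the median total is 52.5, because the page at the middle of one layer is not necessarily the page at the middle of another.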
Score Distribution
| DUCR Total Band | Site Count |
|---|---|
| 0-39 | 2 |
| 40-49 | 14 |
| 50-59 | 22 |
| 60-69 | 11 |
| 70-79 | 1 |
| 80-100 | 0 |

| Citable Score Band | Site Count |
|---|---|
| 0-5 / 30 | 32 |
| 6-10 / 30 | 9 |
| 11-15 / 30 | 8 |
| 16-20 / 30 | 1 |
| 21-30 / 30 | 0 |
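The banding behind the DUCR total distribution can be reproduced with a small lookup. This is a minimal sketch: the band edges come from the table above, but `total_band` and `band_counts` are hypothetical helpers, and the scores in the usage line are invented for illustration rather than taken from the dataset.

```python
from bisect import bisect_right
from collections import Counter

# Lower edge and label of each DUCR total band, copied from the table.
LOWER_EDGES = [0, 40, 50, 60, 70, 80]
LABELS = ["0-39", "40-49", "50-59", "60-69", "70-79", "80-100"]

# Published site counts per band, used here only as a consistency check.
PUBLISHED_COUNTS = {"0-39": 2, "40-49": 14, "50-59": 22, "60-69": 11, "70-79": 1, "80-100": 0}


def total_band(score: float) -> str:
    """Return the band label for a DUCR total between 0 and 100."""
    if not 0 <= score <= 100:
        raise ValueError(f"score {score} is outside 0-100")
    return LABELS[bisect_right(LOWER_EDGES, score) - 1]


def band_counts(scores: list[float]) -> dict[str, int]:
    """Count scores per band, keeping empty bands in the result."""
    counts = Counter(total_band(score) for score in scores)
    return {label: counts.get(label, 0) for label in LABELS}


# The published band counts account for all 50 sampled sites.
assert sum(PUBLISHED_COUNTS.values()) == 50

# Invented example scores, not benchmark data.
print(band_counts([38, 52, 52.5, 64, 71]))
```

The same approach would apply to the citable bands by swapping in the edges 0, 6, 11, 16, and 21 on the 30-point scale.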
What This Means
The sample does not show an access crisis. It shows a source-worthiness crisis.
A page can be crawlable, indexable, and technically available to AI-search crawlers while still being a weak citation candidate. In this sample, the common gap was not robots.txt. It was thin evidence, no methodology, no limitations, no downloadable source material, no author or review signals, and too few self-contained answer blocks.
That matters because AI answer systems and search surfaces need more than a sales page. They need a page that can be parsed, attributed, checked, summarized, and routed.
Crawler Access Matrix
| Crawler/User Agent | Allowed Count | Allowed Percent | Why It Is Tracked |
|---|---|---|---|
| Googlebot | 50/50 | 100% | Google Search crawling and indexing eligibility. |
| Bingbot | 50/50 | 100% | Bing and Microsoft search discovery. |
| OAI-SearchBot | 50/50 | 100% | ChatGPT Search surfacing and citation eligibility. |
| GPTBot | 49/50 | 98% | OpenAI training crawler; tracked separately from search. |
| ChatGPT-User | 49/50 | 98% | User-directed ChatGPT page fetches. |
| ClaudeBot | 50/50 | 100% | Anthropic training crawler. |
| Claude-SearchBot | 50/50 | 100% | Claude search retrieval. |
| Claude-User | 50/50 | 100% | User-directed Claude fetches. |
| PerplexityBot | 50/50 | 100% | Perplexity search indexing and surfacing. |
| Perplexity-User | 50/50 | 100% | User-directed Perplexity fetches. |
Full access study: /ai-visibility-benchmark/ai-crawler-access-study/
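A robots check of this kind can be approximated with Python's standard-library `urllib.robotparser`. The sketch below is not the benchmark's own tooling. The `allowed_agents` helper, the sample robots.txt, and the `example.com` URL are assumptions made for illustration. The user-agent tokens are the ones listed in the matrix.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens tracked in the crawler access matrix.
AGENTS = [
    "Googlebot", "Bingbot", "OAI-SearchBot", "GPTBot", "ChatGPT-User",
    "ClaudeBot", "Claude-SearchBot", "Claude-User",
    "PerplexityBot", "Perplexity-User",
]


def allowed_agents(robots_txt: str, page_url: str) -> dict[str, bool]:
    """Report whether each tracked agent may fetch page_url under robots_txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse text directly, no network call
    return {agent: parser.can_fetch(agent, page_url) for agent in AGENTS}


# Hypothetical robots.txt that blocks only the training crawler.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

result = allowed_agents(SAMPLE_ROBOTS, "https://example.com/services/")
blocked = [agent for agent, ok in result.items() if not ok]
print(f"{sum(result.values())}/{len(result)} agents allowed; blocked: {blocked}")
```

Treat the output as an approximation. Python's parser applies the first matching rule within a group, while Google documents most-specific-rule matching, so results can differ on files that mix Allow and Disallow paths. A robots.txt pass also says nothing about HTTP status, firewall rules, or robots meta tags, which need separate checks.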
Weakest Source Signals
| Signal | Count | Percent |
|---|---|---|
| Methodology present | 10/50 | 20% |
| Limitations present | 6/50 | 12% |
| Dataset/download present | 5/50 | 10% |
| Table present | 3/50 | 6% |
| Original data present | 0/50 | 0% |
| Author name present | 0/50 | 0% |
| Visible date modified | 0/50 | 0% |
These are the signals that turn a page from generic service copy into a usable source.
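The counts and percentages in this table are simple tallies over per-page flags. As a rough sketch, assuming each reviewed record carries one boolean per signal: the field names below are hypothetical and may not match the published codebook, and the three records are invented.

```python
# Hypothetical field names, one boolean flag per source signal.
SIGNALS = [
    "methodology", "limitations", "dataset_download", "table",
    "original_data", "author_name", "date_modified",
]


def signal_rates(records: list[dict[str, bool]]) -> dict[str, tuple[int, float]]:
    """Return (count, percent) per signal; a missing flag counts as absent."""
    total = len(records)
    if total == 0:
        raise ValueError("no records to tally")
    rates = {}
    for signal in SIGNALS:
        count = sum(1 for record in records if record.get(signal, False))
        rates[signal] = (count, 100 * count / total)
    return rates


# Invented records for illustration only.
records = [
    {"methodology": True, "table": True},
    {"methodology": True},
    {"limitations": True},
    {},
]
for signal, (count, percent) in signal_rates(records).items():
    print(f"{signal}: {count}/{len(records)} ({percent:.0f}%)")
```

Applied to the 50 reviewed records, the same tally would yield figures in the form shown above, such as 10/50 (20%) for methodology.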
Vertical Mix
| Vertical | Count | Median DUCR |
|---|---|---|
| Consulting | 16 | 49.0 |
| Accounting | 10 | 51.5 |
| Agency | 9 | 57.0 |
| Home services | 5 | 62.0 |
| Legal | 3 | 61.0 |
| Dental | 2 | 55.5 |
| Pest control | 1 | 64.0 |
| Junk removal | 1 | 57.0 |
| Roofing | 1 | 56.0 |
| Wellness | 1 | 50.0 |
| Med spa | 1 | 49.0 |
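Verticals with fewer than three records (Dental, Pest control, Junk removal, Roofing, Wellness, and Med spa) should be treated as descriptive only. Their medians are not reliable category-level findings, and for the single-record verticals the median is simply that one site's score.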
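The small-sample caveat can be built into the aggregation itself. A minimal sketch, assuming records arrive as (vertical, DUCR total) pairs: the `vertical_summary` helper, the vertical names, and the scores are invented for illustration, and only the three-record threshold comes from the note above.

```python
from collections import defaultdict
from statistics import median

MIN_RECORDS = 3  # verticals below this count are descriptive only


def vertical_summary(records: list[tuple[str, float]]) -> list[dict]:
    """Group scores by vertical and flag groups too small to compare."""
    by_vertical: dict[str, list[float]] = defaultdict(list)
    for vertical, score in records:
        by_vertical[vertical].append(score)
    rows = [
        {
            "vertical": vertical,
            "count": len(scores),
            "median": median(scores),
            "descriptive_only": len(scores) < MIN_RECORDS,
        }
        for vertical, scores in by_vertical.items()
    ]
    # Largest groups first, matching the ordering of the table.
    return sorted(rows, key=lambda row: row["count"], reverse=True)


# Invented scores, not benchmark data.
sample = [
    ("vertical_a", 48), ("vertical_a", 55), ("vertical_a", 61),
    ("vertical_b", 50), ("vertical_b", 59),
]
for row in vertical_summary(sample):
    print(row)
```

Carrying the flag alongside the median keeps small groups visible in the output while making it harder to quote them as category-level results.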
What This Benchmark Measures
| DUCR Layer | What It Measures |
|---|---|
| Discoverable | Crawl access, index/snippet eligibility, sitemap and internal-link access, canonical clarity, and visible HTML. |
| Understandable | Entity clarity, page-role clarity, semantic heading structure, schema alignment, and audience/service context. |
| Citable | Original evidence, methodology, official source support, answer blocks, tables, downloads, author/date/version, and limitations. |
| Routable | Whether source pages route readers and crawlers to the right service, proof, contact, and tracking paths. |
Full scoring reference: /ai-visibility-benchmark/ducr-score/
What This Benchmark Does Not Measure
This benchmark cannot prove stable "LLM rankings." It cannot guarantee Google AI Overview, ChatGPT, Claude, Perplexity, or Copilot citations. It does not estimate AI traffic without analytics evidence. It does not claim that one prompt test represents platform-wide visibility.
Full limitation notes: /ai-visibility-benchmark/limitations/
Dataset Downloads
- Reviewed CSV: /downloads/ai-visibility-benchmark-2026-06-reviewed.csv
- Reviewed JSON: /downloads/ai-visibility-benchmark-2026-06-reviewed.json
- Summary JSON: /downloads/ai-visibility-benchmark-summary-2026-06.json
- Benchmark stats JSON: /downloads/benchmark-stats-2026-06.json
- Codebook: /downloads/ai-visibility-benchmark-codebook-2026-06.md
- DUCR rubric JSON: /downloads/ducr-scoring-rubric-2026-06.json
- AI crawler access checklist: /downloads/ai-crawler-access-checklist-2026-06.csv
- Monthly citation log template: /downloads/monthly-ai-citation-log-template-2026-06.csv
Dataset page: /ai-visibility-benchmark/dataset/
Source Evidence Matrix
| Source Type | How It Is Used |
|---|---|
| Official platform guidance | Used for crawler, indexing, snippet, structured-data, and AI-search measurement requirements. |
| Benchmark dataset | Used for every public statistic and score distribution. |
| Semantic review notes | Used to exclude dirty records and label reviewed rows. |
| SEO Informatica inference | Clearly labeled when recommendations are inferred from official guidance and observed data. |
Reference sources include OpenAI crawler documentation, Google Search Central robots and robots meta documentation, Anthropic crawler documentation, and Perplexity robots.txt documentation.
Recommended Reading
- /ai-visibility-benchmark/crawlable-not-citable-service-pages/
- /ai-visibility-benchmark/citation-worthy-service-pages/
- /ai-visibility-benchmark/answer-source-format/
- /ai-visibility-benchmark/monthly-ai-citation-tracking/
- /search-ai-visibility/
Version History
| Version | Date | Notes |
|---|---|---|
| 2026.06 | June 4, 2026 IST | Initial 50-site benchmark package, prepared from the semantically reviewed dataset. |
Next Step
With the benchmark evidence in hand, the next commercial step is a DUCR baseline review of a service-business website.