SEO Informatica

Service-Business AI Visibility Benchmark Dataset

This page hosts the anonymized, reviewed dataset for the SEO Informatica AI Visibility Benchmark 2026.06.

The dataset is designed to support benchmark claims about public service-business page readiness. It is not designed to expose the audited domains, publish private company findings, or claim market-wide averages.

Dataset Summary

| Field | Value |
| --- | --- |
| Dataset name | SEO Informatica AI Visibility Benchmark Dataset |
| Version | 2026.06 |
| Run ID | semantic-reviewed-50-2026-06-04 |
| Sample size | 50 reviewed records |
| Unique anonymized domains | 50 |
| Collection period | June 3, 2026 UTC / June 4, 2026 IST |
| Domain publication | Domains and URLs withheld; all records use domain_public=false |
| Proposed public license | Attribution required to SEO Informatica; confirm final license before live publication |
| Reviewed CSV download | /downloads/ai-visibility-benchmark-2026-06-reviewed.csv |
| Reviewed JSON download | /downloads/ai-visibility-benchmark-2026-06-reviewed.json |
| Codebook | /downloads/ai-visibility-benchmark-codebook-2026-06.md |
| Rubric | /downloads/ducr-scoring-rubric-2026-06.json |

What Is Public

The public dataset contains anonymized site IDs, domain hashes, verticals, page-quality fields, crawler-access fields, index/snippet fields, schema fields, DUCR scores, review status, and provenance fields.

The public dataset does not expose live domains, live URLs, private analytics, CRM data, server logs, client names, or owner-identifying review notes.
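Readers who download the reviewed CSV may want to confirm the withholding rules for themselves. The following is a minimal sketch, not the publisher's validation script. It assumes `domain_public` is serialized as the string `false`; the other column names (`site_id`, `domain_hash`, `vertical`) and the inline sample rows are hypothetical stand-ins, and the URL pattern is a loose heuristic rather than a complete detector.

```python
import csv
import io
import re

# Hypothetical sample standing in for the reviewed CSV. Column names other
# than domain_public are assumptions, not taken from the codebook.
SAMPLE_CSV = """site_id,domain_hash,domain_public,vertical
site_001,ab12cd,false,consulting
site_002,ef34ab,false,accounting
"""

# Loose heuristic for values that look like a live URL or a bare domain.
URL_LIKE = re.compile(
    r"^(https?://|www\.)|\.(com|net|org|io|in)(/|$)", re.IGNORECASE
)


def find_exposures(csv_text):
    """Return (row_number, column, value) for each cell that breaks the rules."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        # Every record is expected to carry domain_public=false.
        flag = (row.get("domain_public") or "").strip().lower()
        if flag != "false":
            problems.append((row_number, "domain_public", row.get("domain_public")))
        # No cell should contain something that looks like a live URL.
        for column, value in row.items():
            if isinstance(value, str) and URL_LIKE.search(value.strip()):
                problems.append((row_number, column, value))
    return problems
```

An empty list from `find_exposures` means no row failed either check. To run it against the real download, pass the file's text in place of `SAMPLE_CSV`.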

Field Groups

  • Site and sample metadata
  • Audited page URL fields with public values withheld
  • Crawl access fields
  • Index and snippet fields
  • Page-structure fields
  • Entity and source-clarity fields
  • Schema fields
  • DUCR scoring fields
  • Manual semantic review status
  • Provenance fields

Review Status

| Review Status | Count |
| --- | --- |
| semantic_review_approved | 22 |
| semantic_review_approved_with_caveat | 28 |

Rows marked with a caveat were still accepted into the benchmark after semantic review. The caveat label should stay visible so that future readers can see the sample was manually checked rather than accepted directly from automated output.
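The published counts can be turned into shares, and the same tally can be recomputed from the downloaded rows. This is a small sketch under stated assumptions: the two status values and their counts come from the Review Status figures above, while the column name `review_status` is hypothetical and should be checked against the codebook.

```python
from collections import Counter

# Counts as published in the Review Status figures above.
PUBLISHED_COUNTS = {
    "semantic_review_approved": 22,
    "semantic_review_approved_with_caveat": 28,
}


def status_shares(counts):
    """Convert status counts into percentages of the total, rounded to 0.1."""
    total = sum(counts.values())
    return {status: round(100 * n / total, 1) for status, n in counts.items()}


def tally_statuses(rows, column="review_status"):
    """Count rows per review status. The column name is an assumption."""
    return dict(Counter(row[column] for row in rows))
```

With the published counts, `status_shares(PUBLISHED_COUNTS)` gives 44.0 for approved rows and 56.0 for rows approved with a caveat. Comparing `tally_statuses(rows)` against `PUBLISHED_COUNTS` is a quick way to confirm that a download matches this page.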

Downloads

How To Cite The Dataset

Use the dataset version and date, not a vague page title.

Recommended citation format:

SEO Informatica. "Service-Business AI Visibility Benchmark Dataset." Version 2026.06. Collected June 3, 2026 UTC / June 4, 2026 IST. Reviewed anonymized sample of 50 service-business websites.
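For tools that generate citations automatically, the recommended format can be assembled from the version, collection period, and sample size. The helper below is an illustrative sketch; the function name and parameters are hypothetical and not part of any published SEO Informatica tooling.

```python
def format_citation(
    version,
    collected,
    sample_size,
    publisher="SEO Informatica",
    title="Service-Business AI Visibility Benchmark Dataset",
):
    """Build a citation string in the recommended format shown above."""
    return (
        f'{publisher}. "{title}." Version {version}. '
        f"Collected {collected}. "
        f"Reviewed anonymized sample of {sample_size} service-business websites."
    )


citation = format_citation(
    version="2026.06",
    collected="June 3, 2026 UTC / June 4, 2026 IST",
    sample_size=50,
)
```

Keeping the version and collection period as required arguments makes it harder to produce a citation that omits them.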

Known Gaps

This dataset uses public-page audits. It cannot measure private analytics, CRM outcomes, hidden platform behavior, personalized AI answers, or guaranteed citation probability.

The sample is also uneven across verticals. Consulting, accounting, and agency sites account for 35 of the 50 rows (70%), so comparisons between verticals should be treated as descriptive unless a vertical has enough rows to support stronger analysis.
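One way to keep this caution visible during analysis is to compute each vertical's row count and share, and flag small groups before comparing them. The sketch below is illustrative only: the `min_rows` threshold is a hypothetical parameter that the analyst must choose (the dataset does not define one), and the example input is made up rather than drawn from the benchmark.

```python
from collections import Counter


def vertical_summary(verticals, min_rows=10):
    """Summarize rows per vertical and flag groups below min_rows.

    min_rows is an analyst-chosen, hypothetical threshold; the dataset
    itself does not specify when a vertical is large enough.
    """
    counts = Counter(verticals)
    total = sum(counts.values())
    return {
        vertical: {
            "rows": n,
            "share_pct": round(100 * n / total, 1),
            "descriptive_only": n < min_rows,
        }
        for vertical, n in counts.items()
    }


# Made-up input for illustration; not the benchmark's actual distribution.
example = vertical_summary(["consulting"] * 4 + ["legal"], min_rows=3)
```

In the made-up example, the single legal row is flagged as descriptive only, while the four consulting rows pass the chosen threshold.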

Related Pages