Report Schema

The scan report is a structured JSON document organized for action. Understand what each section means so you can turn the data into specific SEO improvements.

Report sections

Standard and Deep customer_v1 reports share the core sections below. The structured_data_benchmarks, structured_data_coverage, and serp_speed_benchmark sections are Deep-only, and the originality section appears on Standard and Deep scans when the analysis completes. The schema_version field is always onpage-report-customer-v1.

meta

Report date, target keyword, location, and URL.

report_date, target_keyword, location, url

on_page_optimization

Final On-Page Optimization Score, grade, numeric confidence, summary, and high-level focus areas.

score, grade, confidence, summary, focus_areas

benchmarks

Page-1 averages vs your page, side by side.

page1_average, your_url

entity_coverage

Entity and term coverage, including related-entity density versus competitors — the core of most optimization work.

your_url_related_entity_density_score, competitor_related_entity_density_score, natural_language_entities, highly_related_terms, keyword_variations, related_category_entities, specific_category_entities

topic_and_classification

Topic classification, swipe content, and authority questions.

page_classification, swipe_content, topical_authority_questions

internal_linking

Source pages that should link to the analyzed URL.

add_internal_links_from, to_your_url

competitor_term_coverage

Term-by-term comparison against ranking competitors.

domains, terms

structured_data_benchmarks

Standard and Deep scans only. Page-1 structured data totals vs your page, side by side.

page1_average, your_url

structured_data_coverage

Standard and Deep scans only. Compact schema type prevalence across the accepted ranking competitor cohort.

competitor_count, top_competitor_count, schema_types

serp_speed_benchmark

Deep scans only. Self-hosted page-experience benchmark of the target URL against the top 3 organic competitor URLs in the same SERP. Optional — Deep responses include the field when the benchmark payload was produced; when present, `status` indicates whether the run completed (`ok`) or was short-circuited (`disabled`, `skipped_*`, `timeout`). Lite and Standard never include it.

status, what_this_measures, measurement_type, device_profile, network_profile, cpu_profile, benchmark_version, web_vitals_version, target, competitors

originality

Standard and Deep scans only. Embedding-based originality and information-gain analysis of the page against the live ranking cohort. Optional — present when the analysis completed; omitted (never null) when it didn't run (too few competitor pages, too little scoreable text, or time budget exhausted). Lite scans never include it. See the detailed field table below.

score, grade, sentence_analysis, duplicative_passages, information_gain, chart

link_opportunities

Standard and Deep scans only. Rates every ranking SERP listing as a link or citation opportunity: the kind of page it is, how fresh it is, whether it competes with your page, the realistic action to earn a link or mention there (reply, pitch inclusion, create a listing, outreach, or none), and 0-100 scores for attainability, willingness, SEO link value, and AI citation value. Optional; present when the feature is enabled and the analysis completed, omitted (never null) otherwise. Lite scans never include it.

algorithm_version, keyword, items
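
Because the optional sections are omitted rather than set to null, a presence check is a safer way to read them than a null check. The following is a minimal Python sketch, assuming the report has already been parsed into a dict and that the sections are top-level keys. The helper names are illustrative and not part of any published client.

```python
# Sections described above as optional; assumed here to be top-level keys.
OPTIONAL_SECTIONS = ("serp_speed_benchmark", "originality", "link_opportunities")


def optional_sections_present(report: dict) -> dict:
    """Map each optional section name to whether the report includes it."""
    return {name: name in report for name in OPTIONAL_SECTIONS}


def speed_benchmark_completed(report: dict) -> bool:
    """True only when serp_speed_benchmark is present and its status is "ok"."""
    section = report.get("serp_speed_benchmark")
    return isinstance(section, dict) and section.get("status") == "ok"


# Illustrative report fragment, not real scan output.
report = {
    "schema_version": "onpage-report-customer-v1",
    "serp_speed_benchmark": {"status": "timeout"},
}
print(optional_sections_present(report))
print(speed_benchmark_completed(report))
```

Checking `status` separately matters because a present serp_speed_benchmark section can still describe a run that was short-circuited.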

How to use the report for SEO

The report is most useful as a prioritization engine. Don't try to act on everything at once — focus on the highest-impact gaps first.

Optimization score — start with the benchmark

on_page_optimization gives the final score, numeric confidence, and focus areas for the page. Use it as the top-level benchmark before drilling into the supporting sections.

Entity coverage — find missing topics

Look at natural_language_entities with coverage_status of "missing". Start with the highest-importance terms. Edit existing sentences to include them rather than adding new paragraphs.
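
The filter described above can be sketched in a few lines of Python. This assumes natural_language_entities is a list of objects; coverage_status comes from the schema, while the `entity` and `importance` field names and the "covered" status value are assumptions made for illustration.

```python
def missing_entities(report: dict, limit: int = 10) -> list:
    """Return entities whose coverage_status is "missing", most important first."""
    entities = report.get("entity_coverage", {}).get("natural_language_entities", [])
    missing = [e for e in entities if e.get("coverage_status") == "missing"]
    # "importance" is a hypothetical numeric field; entries without it sort last.
    missing.sort(key=lambda e: e.get("importance", 0), reverse=True)
    return missing[:limit]


# Illustrative data only; field names other than coverage_status are assumed.
report = {
    "entity_coverage": {
        "natural_language_entities": [
            {"entity": "crawl budget", "coverage_status": "missing", "importance": 0.4},
            {"entity": "canonical tag", "coverage_status": "covered", "importance": 0.9},
            {"entity": "sitemap", "coverage_status": "missing", "importance": 0.8},
        ]
    }
}
for item in missing_entities(report):
    print(item["entity"], item["importance"])
```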

Related entity density — judge depth, not just presence

Compare your_url_related_entity_density_score with competitor_related_entity_density_score to see whether important related entities appear with enough depth for the page length.

Authority questions — close topical gaps

The who/what/where/how questions show what angles your page should address. Add the ones that fit your page intent — don't force irrelevant angles.

Internal linking — strengthen your page

add_internal_links_from gives you specific source pages and anchor text suggestions. These are pages on YOUR site that should link to the analyzed page.

Competitor coverage — benchmark yourself

Compare domains and terms to see which topics the top-ranking pages all cover that you don't. Patterns across multiple competitors are stronger signals than individual outliers.

Benchmarks — quantify the gap

page1_average vs your_url gives you numerical scores across the same metrics. Use this to prioritize which metrics have the largest gaps.
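
One way to rank the gaps is by relative difference, so metrics on different scales can be compared. The sketch below assumes page1_average and your_url are objects keyed by the same metric names with numeric values; the metric names in the sample data are hypothetical.

```python
def benchmark_gaps(benchmarks: dict) -> list:
    """Return (metric, relative_gap) pairs, largest shortfall first.

    relative_gap = (page1_average - your_url) / page1_average, so a positive
    value means the page is below the page-1 average for that metric.
    """
    page1 = benchmarks.get("page1_average", {})
    yours = benchmarks.get("your_url", {})
    gaps = []
    for metric, average in page1.items():
        value = yours.get(metric)
        numeric = isinstance(average, (int, float)) and isinstance(value, (int, float))
        if not numeric or average == 0:
            continue  # skip non-numeric entries and avoid dividing by zero
        gaps.append((metric, (average - value) / average))
    gaps.sort(key=lambda pair: pair[1], reverse=True)
    return gaps


# Illustrative values; the metric names are assumptions.
benchmarks = {
    "page1_average": {"word_count": 1800, "h2_count": 8},
    "your_url": {"word_count": 900, "h2_count": 9},
}
for metric, gap in benchmark_gaps(benchmarks):
    print(f"{metric}: {gap:+.1%}")
```

A positive gap is only a shortfall for metrics where higher is better, so the sign needs to be read per metric.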

Structured data — compare schema coverage

structured_data_benchmarks and structured_data_coverage show structured-data totals and compact schema.org type prevalence across the accepted ranking competitor cohort.

SERP speed (Deep only) — compare page experience

serp_speed_benchmark.target vs serp_speed_benchmark.competitors shows LCP, CLS, approximate TBT, and TTFB side-by-side with the top 3 organic competitors in the same SERP (FCP is captured at the per-probe level too). Recommend page-experience fixes only where the target is materially worse than the competitor median — skip ties and per-probe statuses other than `ok`.
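
The "materially worse than the competitor median" rule can be sketched as follows. This is a simplification under stated assumptions: each entry is flattened to a dict with a `status` and one numeric value per metric, the `lcp_ms` field name is hypothetical, and the 20% tolerance is an arbitrary threshold rather than a documented one. All four metrics named above are lower-is-better, which is what the comparison relies on.

```python
from statistics import median


def materially_worse(target: dict, competitors: list, metric: str, tolerance: float = 0.2):
    """Return True or False, or None when there is not enough usable data.

    Entries whose status is not "ok" are ignored, as the guidance above suggests.
    """
    if target.get("status") != "ok" or metric not in target:
        return None
    values = [c[metric] for c in competitors if c.get("status") == "ok" and metric in c]
    if not values:
        return None
    # Lower is better, so "worse" means exceeding the median by more than the tolerance.
    return target[metric] > median(values) * (1 + tolerance)


# Illustrative measurements; "lcp_ms" is an assumed field name.
target = {"status": "ok", "lcp_ms": 4000}
competitors = [
    {"status": "ok", "lcp_ms": 2000},
    {"status": "ok", "lcp_ms": 2500},
    {"status": "timeout"},
]
print(materially_worse(target, competitors, "lcp_ms"))
```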

Originality (Standard/Deep) — add information, not just coverage

Rewrite duplicative_passages first — each one names the competitor domain it matches. Cover content_most_competitors_have briefly (it's table stakes), answer an uncovered topic question that fits the page intent, and close the unique_data_points gap with data only you have. Re-scan to verify the score moved.
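
The order of work above can be turned into a simple to-do list. The sketch below uses the field names from the originality section and assumes the report is a parsed dict; the function name and the wording of each step are illustrative.

```python
def originality_todo(report: dict) -> list:
    """Build an ordered list of suggested actions from the originality section."""
    section = report.get("originality")
    if section is None:  # the section is omitted when the analysis did not run
        return []
    steps = []
    for passage in section.get("duplicative_passages", []):
        steps.append(f"Rewrite (matches {passage['matched_domain']}): {passage['snippet']}")
    gain = section.get("information_gain", {})
    for cluster in gain.get("content_most_competitors_have", []):
        steps.append(
            f"Cover briefly ({cluster['competitor_count']} competitors): {cluster['snippet']}"
        )
    for question in gain.get("potential_uncovered_topics_for_information_gain", []):
        steps.append(f"Consider answering: {question}")
    data = gain.get("unique_data_points", {})
    average = data.get("page1_average")  # may be null in the report
    if average is not None and data.get("your_count", 0) < average:
        steps.append(
            f"Add unique data points: {data.get('your_count', 0)} vs page-1 average {average}"
        )
    return steps


# Illustrative fragment, not real scan output.
report = {
    "originality": {
        "duplicative_passages": [
            {"snippet": "An opening definition", "matched_domain": "competitor-example.com"}
        ],
        "information_gain": {
            "content_most_competitors_have": [
                {"snippet": "A basic checklist", "competitor_count": 6}
            ],
            "potential_uncovered_topics_for_information_gain": [
                "What does this cost at different team sizes?"
            ],
            "unique_data_points": {"your_count": 4, "page1_average": 9.5, "examples": ["37%"]},
        },
    }
}
for step in originality_todo(report):
    print(step)
```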

The `originality` section

Standard and Deep scans run an embedding-based originality and information-gain analysis: the page's sentences are compared against the pooled sentences of the live ranking cohort, classified as original, shared, or duplicative, and the gaps are extracted as actionable evidence. The section is optional — when the analysis can't run (too few competitor pages fetched, too little scoreable text, or the time budget is exhausted) the field is omitted entirely, never emitted as null. Lite scans never include it. You can try it on any URL with the free Information Gain Checker.

Field	Type	Description
`score`	integer 0–100	Information Gain Score: the percentage of the page's scored sentences with no close semantic equivalent anywhere in the ranking cohort.
`grade`	string enum	One of "Highly original" (score 70+), "Moderately original" (40–69), or "Mostly shared" (below 40).
`sentence_analysis`	object	Sentence bucket counts: original (no close cohort equivalent), shared (similar ground covered), duplicative (near-equivalent exists), and total_scored.
`duplicative_passages[]`	array (up to 3)	The page's sentences that most closely match a competitor: snippet plus the matched_domain it matched. First candidates for a rewrite.
`information_gain.content_most_competitors_have[]`	array (up to 5)	Content clusters shared across multiple competitor domains that the page does not cover: snippet plus competitor_count.
`information_gain.potential_uncovered_topics_for_information_gain[]`	string array (up to 5)	Topic questions almost no competitor answers and the page doesn't answer either — whitespace where added information is cheapest.
`information_gain.unique_data_points`	object	Numeric data points unique to the page: your_count, page1_average (number or null), and examples (up to 8 tokens).
`chart`	string | null	Preformatted text chart (score gauge + sentence buckets) for terminal/agent display. Render as-is in monospace.

Example payload (illustrative values — no real domains):

"originality": {
  "score": 62,
  "grade": "Moderately original",
  "sentence_analysis": {
    "original": 58,
    "shared": 29,
    "duplicative": 7,
    "total_scored": 94
  },
  "duplicative_passages": [
    {
      "snippet": "An opening definition that restates the consensus answer…",
      "matched_domain": "competitor-example.com"
    }
  ],
  "information_gain": {
    "content_most_competitors_have": [
      {
        "snippet": "A basic checklist most ranking pages include…",
        "competitor_count": 6
      }
    ],
    "potential_uncovered_topics_for_information_gain": [
      "What does this cost at different team sizes?",
      "How do results differ for non-English content?"
    ],
    "unique_data_points": {
      "your_count": 4,
      "page1_average": 9.5,
      "examples": ["37%", "4.2:1", "120ms"]
    }
  },
  "chart": "Originality: 62/100  Grade: Moderately original\n…"
}
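
The relationship between sentence_analysis, score, and grade in the field table can be checked against the example values. This is a sketch only: rounding to the nearest integer is an assumption (the table says "percentage" without naming a rounding rule), and the function names are illustrative.

```python
def information_gain_score(sentence_analysis: dict):
    """Percentage of scored sentences classified as original, or None if none were scored."""
    total = sentence_analysis.get("total_scored", 0)
    if total <= 0:
        return None
    # Nearest-integer rounding is an assumption, not a documented rule.
    return round(100 * sentence_analysis["original"] / total)


def grade_for(score: int) -> str:
    """Apply the grade thresholds from the field table."""
    if score >= 70:
        return "Highly original"
    if score >= 40:
        return "Moderately original"
    return "Mostly shared"


# Values copied from the example payload.
sample = {"original": 58, "shared": 29, "duplicative": 7, "total_scored": 94}
score = information_gain_score(sample)
print(score, grade_for(score))
```

With the example values, 58 of 94 scored sentences are original, which is about 61.7% and matches the payload's score of 62 under this rounding assumption.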

Legacy field mapping

If you're migrating from the older report format, here's how the section names map to the current schema.

Legacy name	Current path
Date / TargetKeyword / Location / URL	`meta`
OnPageOptimizationScore	`on_page_optimization`
Page1AverageVsYourUrl	`benchmarks`
NaturalLanguageAnalysis	`entity_coverage.natural_language_entities`
HighlyRelatedWords	`entity_coverage.highly_related_terms`
KeywordVariations	`entity_coverage.keyword_variations`
RelatedCategory	`entity_coverage.related_category_entities`
SpecificCategoryEntities	`entity_coverage.specific_category_entities`
PageClassification	`topic_and_classification.page_classification`
SwipeContent	`topic_and_classification.swipe_content`
TopicalAuthorityQuestions	`topic_and_classification.topical_authority_questions`
Internal Link Recommendations	`internal_linking`
CompetitorAnalysis	`competitor_term_coverage`
StructuredDataBenchmarks	`structured_data_benchmarks`
StructuredDataCoverage	`structured_data_coverage`
Originality	`originality`
LinkOpportunities	`link_opportunities`
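
For migration code, the table above can be carried as a lookup from legacy name to current path. The sketch below copies the mapping from the table and resolves dotted paths against a parsed report dict. The four legacy meta fields are each mapped to `meta` as a whole, since the table does not give per-field targets; the `resolve_legacy` helper is hypothetical.

```python
LEGACY_TO_CURRENT = {
    "Date": "meta",
    "TargetKeyword": "meta",
    "Location": "meta",
    "URL": "meta",
    "OnPageOptimizationScore": "on_page_optimization",
    "Page1AverageVsYourUrl": "benchmarks",
    "NaturalLanguageAnalysis": "entity_coverage.natural_language_entities",
    "HighlyRelatedWords": "entity_coverage.highly_related_terms",
    "KeywordVariations": "entity_coverage.keyword_variations",
    "RelatedCategory": "entity_coverage.related_category_entities",
    "SpecificCategoryEntities": "entity_coverage.specific_category_entities",
    "PageClassification": "topic_and_classification.page_classification",
    "SwipeContent": "topic_and_classification.swipe_content",
    "TopicalAuthorityQuestions": "topic_and_classification.topical_authority_questions",
    "Internal Link Recommendations": "internal_linking",
    "CompetitorAnalysis": "competitor_term_coverage",
    "StructuredDataBenchmarks": "structured_data_benchmarks",
    "StructuredDataCoverage": "structured_data_coverage",
    "Originality": "originality",
    "LinkOpportunities": "link_opportunities",
}


def resolve_legacy(report: dict, legacy_name: str):
    """Return the value at the current path for a legacy name, or None if absent."""
    node = report
    for key in LEGACY_TO_CURRENT[legacy_name].split("."):
        if not isinstance(node, dict) or key not in node:
            return None  # section omitted, e.g. on a Lite scan
        node = node[key]
    return node


# Illustrative report fragment.
report = {"entity_coverage": {"keyword_variations": ["seo audit", "site audit"]}}
print(resolve_legacy(report, "KeywordVariations"))
print(resolve_legacy(report, "Originality"))
```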

Machine-readable schemas

Two JSON Schemas are published as MCP resources, one per tier:

schema://customer-report-v1 — full shape for Standard and Deep scans (response_format=customer_v1). Carries benchmarks, entity coverage including related-entity density, topic and classification, internal linking, and competitor term coverage. Standard and Deep scans may also carry the originality section, and Deep scans the structured data benchmark and coverage sections.
schema://customer-report-v1-lite — reduced shape for Lite scans (response_format=customer_v1_lite). Keeps benchmarks, entity coverage including related-entity density (natural language entities, highly related terms, keyword variations only), and competitor term coverage. Omits topic_and_classification and internal_linking entirely — those are not computed for Lite scans — and never emits originality.

The service picks the right schema automatically based on the job's depth. Explicitly requesting customer_v1 on a Lite job (or customer_v1_lite on a Standard/Deep job) returns a 409 UNSUPPORTED_FORMAT — omit the parameter or use the matching value.
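
A client can avoid the 409 by validating the format locally before sending a request. The sketch below encodes the pairing described above; the depth strings "lite", "standard", and "deep" and the function name are assumptions, and no request is actually made.

```python
from typing import Optional

# Pairing of job depth to response_format, as described above.
# The lowercase depth strings are assumed, not confirmed by the schema docs.
FORMAT_BY_DEPTH = {
    "lite": "customer_v1_lite",
    "standard": "customer_v1",
    "deep": "customer_v1",
}


def response_format_for(depth: str, requested: Optional[str] = None) -> str:
    """Return the format matching the job depth, rejecting a mismatched request locally."""
    expected = FORMAT_BY_DEPTH[depth]
    if requested is not None and requested != expected:
        # The service would answer 409 UNSUPPORTED_FORMAT for this combination.
        raise ValueError(f"{requested!r} is not valid for a {depth} job; use {expected!r}")
    return expected


print(response_format_for("lite"))
print(response_format_for("deep", "customer_v1"))
```

Leaving `requested` unset mirrors omitting the parameter and letting the service choose.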
