Methodology

How we scanned 97,304 EU websites: sample source, scanner architecture, infrastructure, and limitations.

Scan Dataset

  • Scan window: April 7–9, 2026
  • URLs submitted: 114,748
  • Successful scans: 97,304 (84.8%)
  • Failed scans: 17,444 (15.2%; site unreachable, TLS errors, or timeouts)
  • EU countries covered: 25 (all EU member states with ccTLD presence in Tranco)
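
The dataset figures can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers listed above; the variable names are illustrative.

```python
# Figures copied from the Scan Dataset list.
urls_submitted = 114_748
successful_scans = 97_304

# Failed scans are whatever did not complete successfully.
failed_scans = urls_submitted - successful_scans
success_rate = successful_scans / urls_submitted

print(f"failed scans: {failed_scans:,}")
print(f"success rate: {success_rate:.1%}")
```

The result agrees with the published 17,444 failed scans and 84.8% success rate.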

Sample Source

  • Domain list: Tranco Top 1M (research-grade domain ranking, snapshot L7684)
  • Selection method: Filtered to 25 EU country-code TLDs, then selected round-robin by country to limit large-market bias
  • Country distribution: Germany 22,696 / France 8,463 / Netherlands 8,752 / Italy 8,598 / Poland 7,881 / Spain 4,735 + 19 other EU countries (full breakdown in the report)
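
As an illustration of round-robin selection by country-code TLD, here is a minimal sketch. It is not the scanner's actual selection code: the function name, the alphabetical TLD order, and the single global limit are all assumptions made for the example.

```python
from collections import defaultdict, deque

def round_robin_by_tld(ranked_domains, tlds, limit):
    """Pick up to `limit` domains, taking one per TLD in turn.

    `ranked_domains` is an iterable of domain names in rank order.
    `tlds` is the set of country-code TLDs to keep (without the dot).
    """
    # Bucket domains per TLD, preserving rank order within each bucket.
    buckets = defaultdict(deque)
    for domain in ranked_domains:
        tld = domain.rsplit(".", 1)[-1].lower()
        if tld in tlds:
            buckets[tld].append(domain)

    selected = []
    # Visit TLDs in a fixed order (assumed alphabetical here);
    # a TLD leaves the rotation once its bucket is empty.
    active = deque(sorted(buckets))
    while active and len(selected) < limit:
        tld = active.popleft()
        selected.append(buckets[tld].popleft())
        if buckets[tld]:
            active.append(tld)
    return selected

# Hypothetical input, not real Tranco data.
sample = ["example.de", "shop.de", "news.de", "site.fr", "blog.nl", "other.com"]
picked = round_robin_by_tld(sample, {"de", "fr", "nl"}, limit=4)
print(picked)  # ['example.de', 'site.fr', 'blog.nl', 'shop.de']
```

In a scheme like this, a TLD with few ranked domains drops out of the rotation early, so per-country totals can still end up uneven.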

Scanner

  • Engine: Go-based scanner using headless Chromium via Chrome DevTools Protocol (CDP)
  • Browser: Chromium (headless, sandboxed, clean profile per scan)
  • Viewport: Desktop 1920×1080
  • Scan location: Hetzner Cloud, Falkenstein, Germany (EU)
  • CMP detection: 45 consent management platforms recognized via script signatures, DOM selectors, and JavaScript API probing
  • Selector version: 2026-04-06-v23
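
The three CMP detection signals (script signatures, DOM selectors, JavaScript API probing) can be combined roughly as sketched below. This is written in Python for brevity, although the scanner itself is Go-based, and it works on observations already collected from the page. The `CmpSignature` type, the `detect_cmp` function, and the two signature entries are illustrative assumptions, not the scanner's actual signature set.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CmpSignature:
    name: str
    script_hosts: tuple    # substrings expected in loaded script URLs
    dom_selectors: tuple   # selectors expected to match in the DOM
    js_globals: tuple      # global names expected on `window`

# Illustrative entries only; the real scanner recognizes 45 CMPs.
SIGNATURES = (
    CmpSignature("Cookiebot", ("consent.cookiebot.com",),
                 ("#CybotCookiebotDialog",), ("Cookiebot",)),
    CmpSignature("OneTrust", ("cdn.cookielaw.org",),
                 ("#onetrust-banner-sdk",), ("OneTrust",)),
)

def detect_cmp(script_urls, matched_selectors, window_globals):
    """Return (cmp_name, signals) for the first signature with any hit, else None."""
    for sig in SIGNATURES:
        signals = []
        if any(host in url for url in script_urls for host in sig.script_hosts):
            signals.append("script")
        if any(sel in matched_selectors for sel in sig.dom_selectors):
            signals.append("dom")
        if any(name in window_globals for name in sig.js_globals):
            signals.append("js_api")
        if signals:
            return sig.name, signals
    return None

# Hypothetical observations from one page load.
result = detect_cmp(
    script_urls=["https://cdn.cookielaw.org/scripttemplates/otSDKStub.js"],
    matched_selectors={"#onetrust-banner-sdk"},
    window_globals={"dataLayer"},
)
print(result)  # ('OneTrust', ['script', 'dom'])
```

A `None` result corresponds to the case where no known CMP matches and detection would fall back to generic heuristics.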

Infrastructure

  • App server: Hetzner CX-class VPS (web application, PostgreSQL, Redis)
  • Worker servers: 2 Hetzner VPS instances (15 concurrent scan slots total)
  • DNS cache: Local Unbound resolver with periodic pre-warming
  • Total runtime: ~60 hours for the full corpus
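
For a sense of scale, the runtime and concurrency figures imply an average per-URL time budget. The calculation below is a back-of-the-envelope estimate: it assumes all 15 slots were busy for the full ~60 hours and counts failed scans as well as successful ones.

```python
urls_submitted = 114_748   # from Scan Dataset
concurrent_slots = 15      # from Infrastructure
runtime_hours = 60         # approximate ("~60 hours")

# Overall throughput across all slots.
urls_per_hour = urls_submitted / runtime_hours

# Total slot time available, divided by the number of URLs processed.
slot_seconds_per_url = runtime_hours * 3600 * concurrent_slots / urls_submitted

print(f"{urls_per_hour:.0f} URLs/hour overall")
print(f"{slot_seconds_per_url:.1f} slot-seconds per URL")
```

Under those assumptions this works out to roughly 1,900 URLs per hour, or about 28 seconds of slot time per URL.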

What Each Scan Measures

  • Pre-consent snapshot: All cookies, tracking services, and third-party connections set before any user interaction
  • Banner detection: Consent management platform identification, accept/reject button presence
  • Accept flow: Click Accept, record post-consent cookies and trackers
  • Reject flow: Start a fresh browser, click Reject, reload the page, and check whether tracking stops
  • Risk scoring: 0–100 composite score based on 27 distinct compliance findings
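
The reject-flow check amounts to comparing what was observed before consent with what is observed after clicking Reject and reloading. A minimal sketch follows, assuming each snapshot is a list of contacted hostnames and that a set of known tracker hosts is available. The function name, the data shapes, and the hostnames are hypothetical; the scanner's actual tracker list and findings logic are not published.

```python
def trackers_surviving_reject(pre_consent, post_reject, known_trackers):
    """Classify tracker hosts by comparing pre-consent and post-reject snapshots."""
    # Keep only hosts that are on the known-tracker list.
    pre = set(pre_consent) & known_trackers
    post = set(post_reject) & known_trackers
    return {
        "persisting": sorted(pre & post),               # active before and after Reject
        "introduced_after_reject": sorted(post - pre),  # first seen after Reject
        "stopped": sorted(pre - post),                  # no longer contacted after Reject
    }

# Hypothetical snapshots for one site.
known = {"tracker-a.example", "tracker-b.example", "tracker-c.example"}
report = trackers_surviving_reject(
    pre_consent=["tracker-a.example", "tracker-b.example", "cdn.example"],
    post_reject=["tracker-b.example", "tracker-c.example", "cdn.example"],
    known_trackers=known,
)
print(report)
```

Any host in `persisting` or `introduced_after_reject` would indicate tracking that continued despite the Reject click.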

Limitations

  • Desktop only. Mobile viewport behavior may differ.
  • Single scan location. Geo-targeted consent banners may behave differently when a site is visited from non-EU locations.
  • Point-in-time observation. Websites change after scanning. Results represent the state during the scan window.
  • Bot detection. Some sites detect automated browsers and may alter their behavior (we detect and report this when identified).
  • CMP coverage. 45 CMPs are recognized; sites using unrecognized or custom consent solutions fall back to generic heuristic detection.
  • No legal interpretation. Risk scores are technical indicators, not legal compliance determinations.

Data Access

  • Aggregate statistics are published in our blog posts and research report.
  • Individual domain classifications are not published. Site operators can request their scan result by running a free scan or contacting us.
  • Correction requests: If you believe your site's classification is incorrect, run a new scan or contact [email protected].

Reproducibility

The scanner is proprietary software. The aggregate dataset is provided for verification of published statistics. The Tranco domain list used is publicly available at tranco-list.eu.

