Chapter 69: Privacy, Security, and Data Capture

A metadata runtime can make applications safer, but only if it has strong defaults.

The Kit architecture’s metadata protocol (Chapter 22) and capability modules (Chapter 35) centralize a substantial fraction of what frontend applications previously scattered through components: analytics calls, audit records, observability traces, error tracking, feature-flag exposures. The centralization is one of the architecture’s main wins. It is also a serious responsibility: if every component can still push arbitrary payloads to analytics through the central layer, centralization hasn’t helped; it has just moved the problem.

This chapter is about the architectural discipline that makes the Kit-style capability layer safer than the inline alternative. The chapter complements the platform-level security work from Chapter 31 (CSP, SRI, WebAuthn, the cookie deprecation story, the Permissions Policy) with the application-level privacy and data-capture decisions that the architecture has to make explicitly.

The Default: Describe, Don’t Capture

The metadata protocol distinguishes describing what happened from capturing what was submitted. The distinction is the architectural foundation of the privacy story.

This is safe:

<button meta-event="profile.saved">Save</button>

The attribute names the event semantically. The string profile.saved describes the action without revealing any user-specific value. Anyone reading the HTML (the user, the browser’s developer tools, any third-party script that might be on the page, any DOM snapshot tool, any analytics platform) sees the event name and learns something happened that the application calls profile.saved. They don’t learn the user’s identity, the value the user submitted, or anything sensitive.

This is risky:

<input meta-prop-email="person@example.com">

The attribute puts the user’s email value into the HTML. Anyone reading the HTML sees the email. The visibility surface is enormous — third-party scripts, browser extensions, accessibility tools, dev-tools recordings, screen-recording software, analytics platforms that take DOM snapshots, debugging overlays. The metadata protocol has become a data-leakage surface.

The architectural rule: metadata describes meaning; payload capture is a separate, explicit decision. The Kit metadata-boundary listener captures the metadata attributes (which describe meaning). Form data is captured via FormData when explicitly handling a form’s submit. User-supplied values that aren’t form fields are captured only when modules explicitly read them.

By default, the analytics module sees:

The event type (profile.saved).
The boundary context (surface, feature, entity-type, entity-id).
Any explicit meta-prop-* attributes the developer placed on the element.

It does not see:

The values inside form fields (unless the form’s submit was the trigger, in which case FormData is included).
The user’s input from any other source.
Any data from anywhere except what the markup explicitly exposed.

This is the opt-in model for user data. The application has to deliberately include user-specific values; the runtime collects nothing that the markup or a module did not explicitly expose.
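The visibility rules above can be sketched as a small function. This is a minimal illustration, not the Kit runtime's actual listener: `ElementLike`, `describeEvent`, and `CONTEXT_KEYS` are hypothetical names, and a plain object stands in for a DOM element so the sketch runs outside a browser.

```typescript
// Hypothetical stand-in for a DOM element (assumption for this sketch).
interface ElementLike {
  tag: string
  attributes: Record<string, string>
  value?: string // what the user typed; deliberately never read below
  parent?: ElementLike
}

interface DescribedEvent {
  type: string
  context: Record<string, string>
  props: Record<string, string>
}

const CONTEXT_KEYS = ['surface', 'feature', 'entity-type', 'entity-id']
const PROP_PREFIX = 'meta-prop-'

function describeEvent(target: ElementLike): DescribedEvent | null {
  const type = target.attributes['meta-event']
  if (!type) return null // no declared meaning, so nothing is emitted

  // Only explicitly declared meta-prop-* attributes become props.
  const props: Record<string, string> = {}
  for (const name of Object.keys(target.attributes)) {
    if (name.indexOf(PROP_PREFIX) === 0) {
      props[name.slice(PROP_PREFIX.length)] = target.attributes[name]
    }
  }

  // Walk up through enclosing boundaries; the nearest boundary wins per key.
  const context: Record<string, string> = {}
  for (let el = target.parent; el; el = el.parent) {
    if (el.tag !== 'kit-boundary') continue
    for (const key of CONTEXT_KEYS) {
      if (key in el.attributes && !(key in context)) {
        context[key] = el.attributes[key]
      }
    }
  }
  return { type, context, props }
}
```

The function has no code path that touches `value`, which is the point: a field's contents can only reach a module if the markup declares them or a module reads them on purpose.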

Privacy Boundaries

Some regions of the UI are sensitive enough to deserve explicit handling. Payment forms. Authentication forms. Profile fields that include the user’s real name or address. Conversations with medical or legal context. Any region where the application’s analytics, audit, or observability shouldn’t be capturing the full event details.

A private boundary marks a region the application’s capability modules should treat differently:

<kit-boundary surface="payment-form" private>
  <kit-text-field name="card-number" autocomplete="cc-number"></kit-text-field>
  <kit-text-field name="card-cvc" autocomplete="cc-csc"></kit-text-field>
  <kit-button type="submit">Pay</kit-button>
</kit-boundary>

The private attribute is a marker. The metadata-boundary’s context-walk algorithm includes it in the collected context (so modules can see it). Each module is responsible for honoring it.

The analytics module:

const analyticsModule = defineKitModule({
  name: 'analytics',
  events: {
    '*': (event) => {
      if (event.context?.private) {
        // For private boundaries, send only the event type, not the payload
        track(event.type, {
          surface: event.context.surface,
          feature: event.context.feature
          // No entity, no payload
        })
      } else {
        track(event.type, {
          surface: event.context.surface,
          feature: event.context.feature,
          entity: event.context.entity,
          props: filterProps(event.payload)
        })
      }
    }
  }
})

The audit module might have different rules — audit logs are typically the place that records sensitive actions (the user made a payment, the user changed their password), so the audit module captures more, with stricter access controls on the audit data.

The observability module typically redacts entirely inside private boundaries (the error message might include user-supplied content, which shouldn’t reach a third-party error-tracking service uninspected).

The pattern: one boundary attribute, many modules with their own policies. The boundary doesn’t dictate the policy; it provides the signal. The modules implement the policy.
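To make the "one signal, many policies" idea concrete, the three modules can be modeled as plain functions that each decide what leaves the application. This is a sketch under assumptions: the `Policy` type and the three policy functions are hypothetical, the `AppEvent` shape is simplified, and the redaction filter on the non-private analytics path is omitted for brevity.

```typescript
// Simplified event shape (assumption for this sketch).
interface AppEvent {
  type: string
  context: { surface?: string; feature?: string; entity?: string; private?: boolean }
  payload?: Record<string, unknown>
}

// null means "send nothing at all".
type Outgoing = Record<string, unknown> | null
type Policy = (event: AppEvent) => Outgoing

// Analytics: inside a private boundary, keep only the type and coarse context.
const analyticsPolicy: Policy = (event) =>
  event.context.private
    ? { type: event.type, surface: event.context.surface }
    : {
        type: event.type,
        surface: event.context.surface,
        entity: event.context.entity,
        props: event.payload
      }

// Audit: records who did what even in private regions, never the raw payload.
const auditPolicy: Policy = (event) => ({
  action: event.type,
  target: event.context.entity,
  surface: event.context.surface
})

// Observability: drops events from private boundaries entirely.
const observabilityPolicy: Policy = (event) =>
  event.context.private ? null : { kind: event.type, surface: event.context.surface }
```

Writing each policy as a pure function of the event keeps it testable without the runtime: feed it an event with `private: true` and assert on what comes out.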

Separate Analytics and Audit

A frequent design error is conflating analytics and audit. They’re different concerns and shouldn’t share infrastructure.

Analytics is for product behavior. How many users clicked the checkout button last week? Which features are growing? Where do users drop off in the signup flow? The data is aggregated, anonymized when possible, retained for typical analytics periods (3-24 months), and analyzed for product decisions. The acceptable error rate is high — losing some events to network failures is fine.

Audit is for accountability. Which administrator made this change? When? From which IP? With what role? The data is per-user, retained for compliance periods (often 7 years in regulated industries), and analyzed only when investigating specific events. The acceptable error rate is much lower — losing audit records may be a compliance violation.

Putting both through the same pipeline produces a system that’s bad at both. Analytics data with audit-level retention is expensive. Audit data with analytics-level reliability misses records. Privacy regulations also treat them differently: GDPR’s right to erasure (the “right to be forgotten”) generally reaches analytics data, while audit logs kept to satisfy a legal obligation may have to be retained despite an erasure request.

The Kit architecture supports the separation directly. Analytics is one module. Audit is another. Each one observes the events it cares about, with its own retention rules, its own redaction policy, its own delivery infrastructure:

const auditModule = defineKitModule({
  name: 'audit',
  events: {
    // Only specific event types — not wildcards
    'profile.saved': (event) => recordAudit('profile.update', event),
    'profile.deleted': (event) => recordAudit('profile.delete', event),
    'admin.user_impersonated': (event) => recordAudit('admin.impersonate', event),
    'admin.role_changed': (event) => recordAudit('admin.role_change', event),
    'payment.completed': (event) => recordAudit('payment', event),
    // ... only audit-worthy events
  }
})

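function recordAudit(action: string, event: AppEvent) {
  // Send to the audit log infrastructure (typically a different
  // backend service than analytics, with different retention,
  // access controls, and reliability guarantees).
  // `providers` and `runtime` are assumed to be in scope here,
  // for example captured when the module is installed.
  fetch('/api/audit', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      action,
      actor: providers.user.current()?.id,
      target: event.context?.entity,
      surface: event.context?.surface,
      timestamp: Date.now()
      // No raw event payload: only structured audit fields
    })
  }).then((res) => {
    // fetch resolves on HTTP error statuses, so a 4xx/5xx response
    // has to be turned into a failure explicitly.
    if (!res.ok) throw new Error(`audit endpoint returned ${res.status}`)
  }).catch((err) => {
    // Audit failures should be visible: they may be compliance-relevant
    runtime.emit({
      type: 'audit.delivery_failed',
      payload: { action, error: err.message }
    })
  })
}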

The audit module’s subscriber list is explicit (no wildcards). The fields it sends are structured. The error handling is more careful than analytics’ (a failed audit delivery is a thing to notice, not silently drop). The infrastructure is separate.

This is the architectural payoff of capability-as-module. Each capability gets its own policy. The architecture enforces the separation by design, not by convention.
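Because lost audit records are costlier than lost analytics events, one option is to hold each record locally until the server confirms it. The following is a minimal sketch of that idea, not part of Kit: `AuditOutbox`, its methods, and the `maxAttempts` default are all hypothetical, and persistence across page reloads is left out.

```typescript
interface AuditRecord {
  id: string
  action: string
  actor: string | null
  timestamp: number
}

interface PendingRecord {
  record: AuditRecord
  attempts: number
}

// Hypothetical outbox: a record stays pending until it is acknowledged.
class AuditOutbox {
  private pending: PendingRecord[] = []
  private maxAttempts: number

  constructor(maxAttempts: number = 5) {
    this.maxAttempts = maxAttempts
  }

  enqueue(record: AuditRecord): void {
    this.pending.push({ record, attempts: 0 })
  }

  // Everything that still needs to be (re)sent.
  batch(): AuditRecord[] {
    return this.pending.map((entry) => entry.record)
  }

  // Call when the server has confirmed receipt.
  acknowledge(id: string): void {
    this.pending = this.pending.filter((entry) => entry.record.id !== id)
  }

  // Call when a delivery attempt failed. Returns true once the record has
  // used up its attempts and should be escalated; the record is retained
  // either way so it is never silently dropped.
  fail(id: string): boolean {
    const entry = this.pending.find((candidate) => candidate.record.id === id)
    if (!entry) return false
    entry.attempts += 1
    return entry.attempts >= this.maxAttempts
  }
}
```

The outbox never deletes on failure. Escalation (for example, emitting an `audit.delivery_failed` event) is the caller's decision, which keeps the retention rule and the alerting rule separate.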

Sensitive-Data Redaction

For applications that handle sensitive data — financial, medical, legal — the redaction policy needs to be explicit and enforceable.

A redaction policy is a set of rules about what data may not flow through specific modules. The policy can be applied in several places:

At the source. The component or markup doesn’t expose the sensitive data. Form fields with autocomplete="cc-number" are clearly payment data; the architecture knows not to read their values into events. Fields marked with a custom data-private attribute are similarly protected.

At the boundary. The private boundary attribute (above) marks regions where modules should redact.

At the module. The analytics module has a list of field names that always get redacted. The observability module’s error capture strips known-sensitive patterns from error messages.

At the delivery. The analytics provider has its own redaction rules; the audit service has different ones. The architecture can rely on these as a backstop.

For Kit, the recommended pattern is defense in depth — apply redaction at multiple levels, so that a single mistake doesn’t leak data:

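// Keys are compared in normalized form (lowercase, separators removed),
// so 'cardNumber', 'card-number', and 'card_number' all match.
const REDACTED_KEYS = new Set([
  'password', 'cardnumber', 'cvc', 'ssn', 'taxid',
  'authtoken', 'sessionid', 'apikey'
])

const normalizeKey = (key: string) => key.toLowerCase().replace(/[-_\s]/g, '')

function filterProps(payload: unknown): unknown {
  if (!payload || typeof payload !== 'object') return payload
  // Preserve arrays instead of turning them into index-keyed objects
  if (Array.isArray(payload)) return payload.map((item) => filterProps(item))
  const filtered: Record<string, unknown> = {}
  for (const [key, value] of Object.entries(payload)) {
    // Redact by key name; otherwise recurse (primitives pass through)
    filtered[key] = REDACTED_KEYS.has(normalizeKey(key))
      ? '[REDACTED]'
      : filterProps(value)
  }
  return filtered
}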

The filter runs on every payload before it reaches an analytics call. The redaction is centralized (one place to update when a new sensitive field name is added). The pattern is testable in isolation.
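Key-based filtering does not help when the sensitive value sits inside free text, which is the case the observability module faces with error messages. A pattern-based scrubber is one way to cover that. The sketch below is illustrative only: `scrubMessage` and its two patterns are assumptions, the patterns are far from exhaustive, and the card pattern deliberately over-matches (any run of 13 to 19 digits is redacted, including long numeric IDs).

```typescript
// Hypothetical patterns for this sketch; a real list would be longer
// and tuned to the application's data.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g
// 13 to 19 digits, optionally separated by single spaces or hyphens.
const CARD_PATTERN = /\b\d(?:[ -]?\d){12,18}\b/g

function scrubMessage(message: string): string {
  return message
    .replace(EMAIL_PATTERN, '[REDACTED_EMAIL]')
    .replace(CARD_PATTERN, '[REDACTED_CARD]')
}

// scrubMessage('Lookup failed for person@example.com')
//   returns 'Lookup failed for [REDACTED_EMAIL]'
```

Over-matching is the safer bias here: a redacted order number costs some debugging convenience, while an unredacted card number reaches a third party.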

Regulatory Frameworks

A complete privacy architecture has to handle the legal frameworks that govern user data.

The General Data Protection Regulation (GDPR, EU, 2018) gives users rights over their data — the right to know what’s collected, the right to access it, the right to correct it, the right to be forgotten, the right to data portability. Applications subject to GDPR have to implement these rights operationally — which usually means the architecture can identify what data belongs to which user and can delete or export it on request.

The California Consumer Privacy Act (CCPA, in effect since 2020), as amended and expanded by the CPRA (operative since 2023), imposes similar requirements in California, with some differences in scope and enforcement.

Other regional regulations (Brazil’s LGPD, Canada’s PIPEDA, India’s DPDP Act, China’s PIPL) impose varying rules. For many teams shipping global products, GDPR is among the most stringent of these frameworks and serves as a useful baseline, though meeting it does not automatically satisfy the others.

The Kit architecture supports the implementation directly through capability modules:

Consent management. A consent module exposes the current user’s consent state to other modules. Analytics modules check the consent state before sending data; if the user hasn’t consented to analytics, the module doesn’t deliver.

Right to access. A data-export module collects everything the application knows about the user (from storage, from server APIs, from audit logs) and produces a portable export.

Right to be forgotten. A user-deletion module clears local storage, sends a delete request to the server, and emits a user.deleted event that other modules respond to by clearing their own caches.

Each of these is a module in the architecture. The architecture doesn’t force a specific implementation of GDPR compliance, but the pattern of one capability, one module makes the compliance work tractable. The team can implement each capability once, test it once, and audit it once.
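The consent check described above can be sketched as a wrapper around a module's delivery function. Everything here is hypothetical (`ConsentStore`, `gated`, and the category names are assumptions for illustration), and a real consent store would persist the user's choices and record when they were made.

```typescript
type ConsentCategory = 'analytics' | 'marketing'

// Hypothetical in-memory consent store; nothing is granted by default.
class ConsentStore {
  private granted: Record<string, boolean> = {}

  grant(category: ConsentCategory): void {
    this.granted[category] = true
  }

  revoke(category: ConsentCategory): void {
    this.granted[category] = false
  }

  allows(category: ConsentCategory): boolean {
    return this.granted[category] === true
  }
}

// Wraps a delivery function so it runs only while consent is granted.
// The returned function reports whether the record was delivered.
function gated<T>(
  consent: ConsentStore,
  category: ConsentCategory,
  deliver: (record: T) => void
): (record: T) => boolean {
  return (record: T) => {
    if (!consent.allows(category)) return false // dropped, not queued
    deliver(record)
    return true
  }
}
```

Two choices are worth noting. Consent is checked at delivery time rather than at install time, so a revocation takes effect on the next event. Records arriving without consent are dropped rather than queued, so data observed before the user agreed is not sent retroactively.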

The Permissions Policy and Other Server-Side Controls

Chapter 31 introduced the Permissions Policy HTTP header. It’s worth naming here as part of the production privacy story.

A strict Permissions Policy on every page reduces the application’s attack surface and removes capabilities the application doesn’t need:

Permissions-Policy:
  geolocation=(),
  camera=(),
  microphone=(),
  payment=(self),
  fullscreen=(self),
  display-capture=()

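The header tells the browser: this page doesn’t use geolocation, camera, microphone, or screen capture, so even if a script tries, the browser should refuse. The capabilities the application does use (payment, fullscreen) are allowed only on the page’s own origin. The directives are shown on separate lines for readability; on the wire they form a single comma-separated header value.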

For Kit applications, the Permissions Policy is set at the server level (in the HTTP response). The architecture doesn’t enforce it; the platform does. The architecture’s job is to make sure the application doesn’t need capabilities it shouldn’t have, so the Permissions Policy can be strict.
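Since the header is set on the server, it can help to generate it from a typed map rather than hand-editing a string. The helper below is a sketch, not a Kit API: `buildPermissionsPolicy` is a hypothetical name, and it handles only empty allowlists, the `self` keyword, and explicit origins (the `*` wildcard form is not covered).

```typescript
// An empty list disables the feature; 'self' is a keyword; any other
// entry is treated as an origin and emitted as a quoted string.
function buildPermissionsPolicy(policy: Record<string, string[]>): string {
  return Object.keys(policy)
    .map((feature) => {
      const allowlist = policy[feature]
        .map((entry) => (entry === 'self' ? 'self' : `"${entry}"`))
        .join(' ')
      return `${feature}=(${allowlist})`
    })
    .join(', ')
}

const permissionsHeader = buildPermissionsPolicy({
  geolocation: [],
  camera: [],
  microphone: [],
  payment: ['self'],
  fullscreen: ['self'],
  'display-capture': []
})
// permissionsHeader is:
// geolocation=(), camera=(), microphone=(), payment=(self), fullscreen=(self), display-capture=()
```

Keeping the policy as data also gives code review a single place to notice when someone widens an allowlist.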

A Content Security Policy (CSP, also Chapter 31) similarly restricts what scripts the page can load. For Kit applications, a strict CSP catalogs the application’s actual sources:

Content-Security-Policy:
  default-src 'self';
  script-src 'self' https://cdn.example.com 'nonce-{random}';
  style-src 'self' 'unsafe-inline';
  connect-src 'self' https://api.example.com https://realtime.example.com;
  img-src 'self' data: https:;
  frame-ancestors 'none';

The CSP is application-specific. The architecture doesn’t dictate it. The architecture’s small dependency surface (a runtime, Lit, the application’s modules) makes a strict CSP easier to maintain than for applications with many third-party scripts.
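The `{random}` placeholder in the policy above has to be replaced with a fresh, unpredictable nonce on every response, and the same value has to appear on each permitted script tag. A server-side helper might assemble the header as follows. This is a sketch under assumptions: `buildCsp` is a hypothetical name, and generating the nonce (with a cryptographically secure random source) is left to the caller.

```typescript
// Serializes a directive map into a CSP header value and appends the
// per-response nonce to script-src. The nonce is passed in, not generated.
function buildCsp(directives: Record<string, string[]>, nonce: string): string {
  return Object.keys(directives)
    .map((name) => {
      const sources = directives[name].slice() // copy; do not mutate input
      if (name === 'script-src') sources.push(`'nonce-${nonce}'`)
      return `${name} ${sources.join(' ')}`
    })
    .join('; ')
}

const cspHeader = buildCsp(
  {
    'default-src': ["'self'"],
    'script-src': ["'self'", 'https://cdn.example.com'],
    'frame-ancestors': ["'none'"]
  },
  'abc123' // placeholder only; a real nonce must be random per response
)
// cspHeader is:
// default-src 'self'; script-src 'self' https://cdn.example.com 'nonce-abc123'; frame-ancestors 'none'
```

Copying each source list before appending the nonce matters: the directive map is typically a module-level constant, and mutating it would accumulate stale nonces across requests.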

Production Observability

A specific operational concern worth naming: production observability of the architecture itself.

The runtime’s diagnostic stream (Chapter 44) captures every event, command, and module operation. In production, sending all of that to an observability platform is wasteful and privacy-risky. The pattern is to:

Sample the stream (record one in every N events).
Capture only specific event types (errors, slow handlers, security-relevant events).
Forward sanitized records to the observability platform (Honeycomb, Datadog, Sentry, the team’s choice).

A small observability module handles this:

const observabilityModule = defineKitModule({
  name: 'observability',
  onInstall: ({ runtime }) => {
    runtime.onDiagnostic((entry) => {
      // Sample: record 5% of normal events, 100% of errors
      const isError = entry.kind.endsWith('-error')
      if (!isError && Math.random() > 0.05) return

      // Sanitize: strip sensitive fields
      const sanitized = sanitize(entry)

      // Forward to observability service
      observability.record(sanitized)
    })

    runtime.on('error.application', (event) => {
      observability.captureError(sanitize(event.payload), {
        surface: event.context?.surface,
        feature: event.context?.feature
      })
    })
  }
})

The module gives the architecture a production observability surface that’s predictable, sampled, and sanitized. The team has access to error rates, slow handlers, and deployment-related regressions, while the sanitization step limits what user data reaches the observability provider.
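The module above leaves `sanitize` undefined and samples with `Math.random()` inline, which makes the behavior hard to test. One possible shape for both pieces is sketched below. The `DiagnosticEntry` fields beyond `kind`, the `SENSITIVE_KEYS` list, and the injectable `random` parameter are assumptions made for this example, and this `sanitize` only inspects top-level `detail` fields.

```typescript
// Assumed shape: `kind` appears in the module above; `detail` is hypothetical.
interface DiagnosticEntry {
  kind: string
  detail?: Record<string, unknown>
}

const SENSITIVE_KEYS = ['password', 'cardNumber', 'cvc', 'authToken', 'sessionId']

// Errors are always kept; other entries are kept with probability `rate`.
// `random` is injectable so the decision can be tested deterministically.
function shouldRecord(
  entry: DiagnosticEntry,
  rate: number,
  random: () => number = Math.random
): boolean {
  if (entry.kind.endsWith('-error')) return true
  return random() < rate
}

// Returns a copy with sensitive top-level detail fields replaced.
function sanitize(entry: DiagnosticEntry): DiagnosticEntry {
  const source = entry.detail || {}
  const detail: Record<string, unknown> = {}
  for (const key of Object.keys(source)) {
    detail[key] = SENSITIVE_KEYS.indexOf(key) !== -1 ? '[REDACTED]' : source[key]
  }
  return { kind: entry.kind, detail }
}
```

Returning a copy rather than mutating the entry keeps the diagnostic stream intact for any other local consumer, such as a development-time inspector.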

Bridge to Migration

The next chapter (Chapter 70) is the migration story — how a team with an existing React (or Vue, or framework-bound) application moves toward the Kit architecture without rewriting everything at once. The migration patterns build on the production concerns this chapter introduced: the security and privacy disciplines have to be maintained throughout the migration, not as an afterthought.

Exercise: Audit Your Capture Surface

Take a Kit application you’ve been building (or any application using the architecture). Audit what each capability module captures.

Inventory the events. List every meta-event and meta-command declared in the application’s markup. For each one, document what it represents semantically.

Inventory the captures. For each module subscribed to events, document what fields the module sends to its target service. The analytics module — what does it send to the analytics provider? The audit module — what does it record? The observability module — what does it forward to the error-tracking service?

Identify private boundaries. Find regions of the application that should be private (payment forms, authentication forms, sensitive profile fields, message content). Verify they’re marked with the private attribute.

Check the modules’ policies. For each capability module, verify it honors the private flag. Test that an event from a private boundary doesn’t leak payload data to the module’s target.

Test redaction. Find a sensitive field (password, card number, SSN). Trigger an event that includes the field’s value in the payload. Verify the redaction filter catches it.

Verify consent handling. If your application has a consent flow, verify that analytics modules don’t send data when consent hasn’t been granted.

Check the CSP and Permissions Policy. Inspect the HTTP response headers. Verify the CSP is strict (no unsafe-inline for scripts, no unsafe-eval). Verify the Permissions Policy disables capabilities the application doesn’t use.
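The CSP part of the last step can be partly automated. The checker below is a rough sketch, not a complete CSP linter: `findCspProblems` is a hypothetical helper that only looks for `'unsafe-inline'` and `'unsafe-eval'` in the directive that governs scripts, and it ignores the rule that browsers disregard `'unsafe-inline'` when a nonce or hash is present.

```typescript
function findCspProblems(header: string): string[] {
  const problems: string[] = []
  const directives: Record<string, string[]> = {}

  // Split "name source source; name source" into a directive map.
  for (const part of header.split(';')) {
    const tokens = part.trim().split(/\s+/).filter((token) => token.length > 0)
    if (tokens.length > 0) directives[tokens[0].toLowerCase()] = tokens.slice(1)
  }

  // script-src falls back to default-src when it is absent.
  const scriptSources = directives['script-src'] || directives['default-src']
  if (!scriptSources) {
    problems.push('no script-src or default-src: scripts are unrestricted')
    return problems
  }

  for (const risky of ["'unsafe-inline'", "'unsafe-eval'"]) {
    if (scriptSources.indexOf(risky) !== -1) {
      problems.push(`script sources allow ${risky}`)
    }
  }
  return problems
}
```

Run against the header value captured from a real response, an empty result means only that these two specific weaknesses are absent; it says nothing about overly broad host allowlists.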

Reflect on:

What was being captured that you didn’t expect? (Often: more than the team realized.)
What sensitive data could leak if a single guard failed? (Often: the user’s email, their input values, their identity tokens.)
How would you respond to a GDPR data-subject request today? How long would it take?
If a user asked to be deleted, what would the modules need to do?

The audit is the architecture’s accountability surface. Doing it once catches issues that would have shipped to production. Doing it regularly catches drift as the application evolves. The Kit architecture makes the audit tractable because the capture surface is centralized — the audit is per-module, not per-component.