Chapter 20: HTML Is an Application Language

By the time most developers encounter modern frontend work, HTML has been reduced to output.

The framework renders it. The component returns it. The template generates it. The build pipeline transforms it. The browser receives it. HTML, in this view, is the last mile of the application — the format the framework happens to produce on its way to the user. A working frontend developer in 2025 may spend years writing components, hooks, and effects without ever consciously choosing an HTML element. The framework picks the elements; the elements are an implementation detail.

This chapter is the part of the book where we have to disagree with that view.

HTML isn’t a serialization format. It’s an application language — a vocabulary of elements that encode structure, intent, controls, navigation, grouping, labeling, and relationships. The browser understands HTML before any JavaScript runs. Assistive technologies understand it. Search engines understand it. CSS understands it. Forms understand it. The accessibility tree understands it. User agents understand more of HTML than most application code gives them credit for.

If we’re asking what modern frontend should look like when built for the browser we have now, we have to begin by recovering HTML as a serious application layer. Not the format the framework happens to produce. The first language we choose, deliberately, with intention.

Native Elements Are Behaviors

A native element isn’t only a tag. It’s a bundle of meaning and behavior the platform already implements.

An <a> link isn’t a <span> with blue text. It’s navigation, focus, keyboard behavior (Enter activates, Tab moves to the next element), context-menu behavior (right-click opens a menu with Copy Link Address, Open in New Tab, Save Link As), middle-click-to-open-in-new-tab behavior, drag-as-link behavior, semantic relationship in the accessibility tree, default styling for visited and unvisited and hovered states, and a contract with the browser’s history and address bar that lets the user share the destination as a URL.

A <button> isn’t a styled rectangle. It’s activation (click, touch, Enter, Space), keyboard handling, disabled state, form association (a <button> inside a <form> submits the form by default unless its type is changed), accessible name calculation, focus ring, hover/active states, and a platform convention that screen readers, voice control software, and other assistive technologies all recognize.

A <label> isn’t decorative text. It establishes an explicit relationship with a form control. Clicking the label focuses or activates the associated control. The label’s text becomes the accessible name of the control. The label increases the hit area for users with motor difficulties. Screen readers announce the label when the user moves focus to the control.

A <form> isn’t a <div> around inputs. It’s a transaction boundary. Submitting the form triggers a default action (a POST or GET request) that the browser handles natively. The form participates in browser history. Validation can be applied at the form level. Field values can be serialized as FormData. The form’s lifecycle is something the browser knows how to manage.

A <dialog> isn’t a fixed-position panel. It’s a top-layer element with modality, focus management (when the dialog is opened as a modal, focus moves into it and the rest of the page becomes inert, so keyboard focus stays inside the dialog), a backdrop, browser-managed Escape-to-close behavior, and accessibility-tree integration that announces the dialog as a dialog to screen readers.

HTML is full of these small pieces of application architecture. When we use them, the platform participates. When we replace them with generic elements, we have to rebuild what we discarded.

This is why the first rule of web-native architecture isn’t “avoid JavaScript.” It’s “don’t throw away semantics by default.”

The Div Is Not Neutral

A <div> is useful because it means almost nothing. It groups content. It gives authors a box. It can be styled, measured, moved, and scripted.

That neutrality is also the problem.

When a <div> is used as a button, the application has hidden meaning from the browser. The developer may know what the element is supposed to be. The CSS may make it look like a button. The click handler may make it respond to a mouse. The platform doesn’t automatically know any of this. A <div>-button needs a role="button". It needs tabindex="0" to be focusable. It needs explicit keyboard handling for Enter and Space. It needs aria-disabled and behavior to match. It needs an accessible name, often supplied by aria-label. It needs focus styling that matches the rest of the application. It needs hover and active states. If it lives inside a form, it needs to know whether to submit the form or not. It needs to announce itself correctly to screen readers, voice-control software, and any other assistive technology that might encounter it.

That’s a lot of work to rebuild something the browser already has.
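To make the checklist concrete, here is a rough sketch of a <div>-button next to its native equivalent. The ids, the fake-button class, and the save() function are made up for illustration, and the keyboard handling is deliberately simplified, so treat it as a minimum rather than a complete reimplementation.

```html
<!-- Rebuilt by hand: every attribute and handler below replaces a native default. -->
<div class="fake-button" role="button" tabindex="0" id="fake-save">Save</div>

<script>
  // Hypothetical action, for illustration only.
  function save() {
    console.log("saved");
  }

  const fake = document.getElementById("fake-save");

  fake.addEventListener("click", () => {
    // aria-disabled only changes what is announced; it does not block events,
    // so the check has to be written by hand.
    if (fake.getAttribute("aria-disabled") === "true") return;
    save();
  });

  fake.addEventListener("keydown", (event) => {
    // Simplified: a native button activates on Enter at keydown and on Space at keyup.
    if (event.key === "Enter" || event.key === " ") {
      event.preventDefault(); // keep Space from scrolling the page
      fake.click();
    }
  });
</script>

<!-- Native: focusable, keyboard-activated, and exposed as a button by default. -->
<button type="button" id="real-save">Save</button>

<script>
  document.getElementById("real-save").addEventListener("click", save);
</script>
```

Even this version still lacks focus styling, hover and active states, and form participation, all of which the <button> brings with it.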

The same pattern repeats with custom dropdowns (rebuilding <select>), custom modals (rebuilding <dialog>), custom checkboxes (rebuilding <input type="checkbox">), custom radio groups (rebuilding <input type="radio"> and the implicit name-based group behavior), custom date pickers (rebuilding <input type="date">), custom tabs (hand-wiring the tablist/tab/tabpanel roles, states, and keyboard behavior, which have no native element and so must be implemented in full), and a long list of others.

Sometimes custom behavior is genuinely necessary. Until recently, a native <select> couldn’t be styled to match a specific design; the new customizable <select> work, which we’ll come back to later in this chapter, finally changes that. A native <dialog> with dependable focus handling wasn’t available across the major browsers until 2022. The decision to replace a native element was usually defensible in its moment.

The pattern is that the moment passes. The platform catches up. The custom element remains, often forever, because nobody refactors a working component back to the native primitive once the framework wrapper exists. The accessibility debt accumulates. Future developers inherit it without context. The original justification has been forgotten. The wrapper just is what a button looks like in this codebase.

Replacing a native element should require active justification, not a default. “Why isn’t this a <button>?” should be a question the team is prepared to answer.

The Accessibility Tree

A specific thing happens when HTML is parsed, and it’s worth describing carefully because it explains a lot of the rest of this chapter.

When the browser parses HTML, it builds two trees. The first is the DOM tree — the data structure we manipulate from JavaScript, the structure CSS selectors target, the thing visible in browser dev tools. The second is the accessibility tree — a parallel structure that represents the document in terms a screen reader, voice-control software, or other assistive technology can use. The accessibility tree contains accessibility nodes, each with a role (button, link, heading, form, dialog, region, etc.), an accessible name, an accessible description, and a set of states (focused, expanded, disabled, pressed, checked).

The accessibility tree is computed automatically from the HTML. A <button> becomes an accessibility node with the role button. An <a href="..."> becomes a node with the role link. An <h1> becomes a node with the role heading and a level of 1. A <label for="x"> is associated with the form control it labels, and the control’s accessibility node gets the label’s text as its accessible name.

When you replace a <button> with a <div>, the accessibility tree doesn’t get a button anymore. It gets a generic node with the role generic and no behavior. Screen reader users encountering the application hear nothing useful about that element — not its purpose, not its activation method, not its state. The element exists visually, but in the accessibility tree it’s empty.
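The mapping from markup to accessibility nodes can be sketched with a few elements and the roles and names a browser would be expected to compute for them. The ids, the URL, and the text are illustrative, and the way dev tools present the tree differs between browsers.

```html
<!-- Native markup: the role and accessible name are computed by the browser. -->
<h1>Profile</h1>                                 <!-- role: heading, level 1, name "Profile" -->
<a href="/settings">Settings</a>                 <!-- role: link, name "Settings" -->
<label for="display-name">Display name</label>
<input id="display-name" name="displayName">     <!-- role: textbox, name "Display name" -->
<button type="button">Save</button>              <!-- role: button, name "Save" -->

<!-- Generic markup: styled to look the same, but nothing says it is a control. -->
<div class="button">Save</div>                   <!-- role: generic; only the text "Save" is exposed -->
```

Opening the accessibility panel in browser dev tools on a page like this is a quick way to check the computed role and name of each element.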

This is why the accessibility tree matters: it is a real consequence of markup choices. Markup isn’t styled output. It’s the data that determines what a substantial fraction of your users perceive when they use your application.

Marcy Sutton, Léonie Watson, Hidde de Vries, and a generation of accessibility advocates have spent the past decade and a half making this argument publicly, repeatedly, with patience that most of us couldn’t sustain. The work is mostly thankless. The legal and ethical case for it is overwhelming, the technical case for it is well-documented, and the field still routinely produces inaccessible applications because the inherited defaults push the wrong direction. The accessibility-as-platform-contract chapter (Ch 29) will return to this with more weight. The point for now is that the choice between <button> and <div role="button"> isn’t an academic preference. It changes what your application is, for a non-trivial fraction of the people who use it.

ARIA: The First Rule

When semantic HTML can’t express what we need, ARIA (Accessible Rich Internet Applications) provides additional attributes that supplement the accessibility tree.

role="dialog". aria-label="Close". aria-expanded="true". aria-current="page". aria-live="polite". The vocabulary is significant — ARIA defines roles, states, and properties for most of the interactive patterns that don’t have direct HTML elements.

The W3C’s guidance includes a rule that’s been cited so often it’s become a slogan among accessibility engineers: the first rule of ARIA is don’t use ARIA. In the W3C’s Using ARIA note, the rule says, roughly, that if a native HTML element or attribute already has the semantics and behavior you need, you should use it instead of repurposing another element and adding ARIA. The ARIA Authoring Practices Guide makes the companion point that no ARIA is better than bad ARIA. The elaboration is that adding ARIA attributes to a <div> to make it act like a <button> is almost always worse than using a <button> in the first place, because every ARIA pattern has subtle requirements that custom implementations routinely miss.

A role="button" on a <div> tells the accessibility tree it’s a button. The <div> still doesn’t respond to Enter or Space. The screen reader user hears button and then activates the element with the keyboard, and nothing happens. The custom implementation has to add keyboard handlers manually. It has to add focus styling manually. It has to handle aria-disabled manually. Every one of these manual steps is a place where the implementation can go wrong, and the testing surface for all the assistive technologies that might encounter this element is enormous.

ARIA is essential for the cases where it’s needed — for aria-live announcements, for aria-controls relationships, for the role="tablist" pattern that has no native equivalent, for the aria-current and aria-expanded states that decorate native elements with extra meaning. When you need it, use it. When you can use a native element instead and get the same behavior for free, prefer the native element.
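A small sketch of ARIA used the way this section recommends, as decoration on native elements, not as a replacement for them. The ids, labels, and status messages are invented for the example.

```html
<nav aria-label="Primary">
  <!-- aria-current marks which native link is the current page. -->
  <a href="/" aria-current="page">Home</a>
  <a href="/settings">Settings</a>
</nav>

<!-- A real button supplies focus and keyboard behavior; ARIA adds only the expanded state. -->
<button type="button" id="filters-toggle" aria-expanded="false" aria-controls="filters">
  Filters
</button>
<div id="filters" hidden>
  <label><input type="checkbox" name="in-stock"> In stock only</label>
</div>

<!-- A polite live region: text written here is announced without moving focus. -->
<div id="status" aria-live="polite"></div>

<script>
  const toggle = document.getElementById("filters-toggle");
  const panel = document.getElementById("filters");
  const status = document.getElementById("status");

  toggle.addEventListener("click", () => {
    const wasExpanded = toggle.getAttribute("aria-expanded") === "true";
    toggle.setAttribute("aria-expanded", String(!wasExpanded));
    panel.hidden = wasExpanded;
    status.textContent = wasExpanded ? "Filters hidden" : "Filters shown";
  });
</script>
```

The script only keeps the ARIA state in sync with what is visible; activation, focus, and keyboard handling all come from the <button>.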

The deeper point is that ARIA is a fallback, not a primary tool. The primary tool is HTML.

Landmark Elements and the Document Outline

A modern HTML document has a small set of landmark elements — <header>, <nav>, <main>, <aside>, <footer>, and <section> with an accessible name — that organize the page into navigable regions.

Screen readers use these landmarks for navigation. A user pressing the keyboard shortcut for next landmark moves between the major sections of the page in a way the user can predict. A user who has visited similar sites before can find the main content area, the navigation, and the footer without scrolling through the whole document.

Most production web applications don’t use landmark elements consistently. The typical pattern is <div class="header">, <div class="navigation">, <div class="content">, <div class="sidebar">, <div class="footer">. Each of those <div>s is invisible to landmark navigation. The page works visually for sighted users and is much less navigable for everyone else.

Switching to <header>, <nav>, <main>, <aside>, <footer> is a one-line-per-element change that provides immediate, measurable accessibility improvements. The change costs nothing. The improvement is real. Most teams haven’t done it for the same reason most teams haven’t refactored their <div>-button wrappers — habit outliving constraint.

The <h1> through <h6> heading elements similarly contribute to the document outline. A well-structured document has a single <h1> that names the page, <h2> elements that name the major sections, <h3> elements that name subsections within those, and so on. Screen reader users can navigate by heading. Search engines weight headings as significant content. The visual styling of the heading is a separate concern that CSS handles, but the semantic level matters.
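As an illustration, the div-based page skeleton described above might be rewritten along these lines. The labels, headings, and link targets are placeholders, and the role comments reflect the default mappings for elements placed directly in the body.

```html
<body>
  <header>                                   <!-- banner landmark -->
    <nav aria-label="Primary">               <!-- navigation landmark -->
      <a href="/">Home</a>
      <a href="/account">Account</a>
    </nav>
  </header>

  <main>                                     <!-- main landmark -->
    <h1>Account</h1>

    <!-- A section becomes a region landmark only when it has an accessible name. -->
    <section aria-labelledby="profile-heading">
      <h2 id="profile-heading">Profile</h2>
      <h3>Display name</h3>
      <p>Shown to other people on your profile.</p>
    </section>
  </main>

  <aside aria-label="Related help">          <!-- complementary landmark -->
    <a href="/help/profile">Help with profiles</a>
  </aside>

  <footer>                                   <!-- contentinfo landmark -->
    <a href="/privacy">Privacy</a>
  </footer>
</body>
```

Existing class names can stay on the new elements, so the stylesheet usually needs little or no change.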

The pattern is consistent. Native elements provide structure. The platform exposes that structure through the accessibility tree. Assistive technologies use the structure to make the application navigable. When the structure is missing, the application becomes a single opaque block.

Form-Associated Elements

The form-associated elements are a separate category worth naming.

A <form> is, as established earlier, a transaction boundary. The elements that participate in forms (<input>, <select>, <textarea>, <button>, <fieldset>, <legend>, <output>, <label>) each have specific roles in the transaction. The submittable controls contribute their values to FormData when the form is submitted. They participate in constraint validation through attributes such as required, pattern, min, max, and step. They expose accessible names through their labels. They participate in form-level submission events.

A custom <div>-based form control sits outside this system entirely. The custom field doesn’t contribute its value to FormData. It doesn’t participate in constraint validation. It doesn’t fire the standard form events. The form-level submit won’t include the custom field unless the application has added separate machinery to make it do so.

There’s a third option, first shipped in Chromium in 2019 and supported across the major browsers since 2023: form-associated custom elements. A custom element can declare itself form-associated with a static formAssociated flag and obtain an ElementInternals object by calling attachInternals(). Through that object the element reports its value and validity, and the browser includes it in the form’s FormData, takes its validity into account during constraint validation, and treats it like a native form control. This is the platform-blessed path to a custom field that still participates in form semantics. The Kit components built in Part V use this pattern extensively.
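A minimal sketch of a form-associated custom element, assuming a hypothetical <star-rating> element that is not one of the book’s Kit components. It shows only the form plumbing: a real control would also need rendering, focus handling, and a way for the user to change the value.

```html
<form id="review">
  <star-rating name="rating" value="3"></star-rating>
  <button type="submit">Send</button>
</form>

<script>
  // <star-rating> is a hypothetical element used only for illustration.
  class StarRating extends HTMLElement {
    // Opt in to form association.
    static formAssociated = true;
    static observedAttributes = ["value"];

    constructor() {
      super();
      // ElementInternals is the channel for reporting value and validity.
      this.internals = this.attachInternals();
    }

    connectedCallback() {
      this.sync(this.getAttribute("value"));
    }

    attributeChangedCallback(name, oldValue, newValue) {
      this.sync(newValue);
    }

    sync(value) {
      // The value is submitted under the element's name attribute.
      this.internals.setFormValue(value);

      const rating = Number(value);
      if (value === null || value === "") {
        this.internals.setValidity({ valueMissing: true }, "Please choose a rating.");
      } else if (!Number.isInteger(rating) || rating < 1 || rating > 5) {
        this.internals.setValidity({ rangeUnderflow: rating < 1, rangeOverflow: rating > 5, badInput: !Number.isInteger(rating) }, "Rating must be a whole number from 1 to 5.");
      } else {
        this.internals.setValidity({}); // an empty flags object marks the element valid
      }
    }
  }

  customElements.define("star-rating", StarRating);

  document.getElementById("review").addEventListener("submit", (event) => {
    event.preventDefault();
    const data = new FormData(event.target);
    console.log(data.get("rating")); // logs the submitted rating
  });
</script>
```

Because the element reports through ElementInternals, the form’s own checkValidity() and FormData see it without any form-specific glue code.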

The chapter on forms (Ch 24) goes deeper. The point here is that forms have semantics — they exist as application-level constructs that the browser understands — and replacing them with custom DOM is a much bigger architectural choice than most teams treat it as.

The Modern Native Controls

A handful of native elements have shipped or matured recently and are worth naming because most teams haven’t caught up to them yet.

<dialog> shipped with reliable focus management around 2022. The element provides modal and non-modal dialog behavior, top-layer rendering, automatic Escape-to-close, focus trap, and accessibility-tree integration. Most modal libraries written before 2022 still exist in codebases that could be using <dialog> directly.
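A small confirmation dialog, sketched with placeholder ids and wording. The only script is the call that opens it. Closing, Escape handling, and the backdrop come from the element.

```html
<button type="button" id="open-confirm">Delete account</button>

<dialog id="confirm" aria-labelledby="confirm-title">
  <h2 id="confirm-title">Delete this account?</h2>
  <!-- method="dialog" closes the dialog instead of sending a request. -->
  <form method="dialog">
    <button value="cancel">Cancel</button>
    <button value="delete">Delete</button>
  </form>
</dialog>

<script>
  const dialog = document.getElementById("confirm");

  document.getElementById("open-confirm").addEventListener("click", () => {
    dialog.showModal(); // modal: top layer, backdrop, rest of the page inert
  });

  dialog.addEventListener("close", () => {
    // returnValue holds the value of the button that closed the dialog;
    // it stays empty when the dialog is dismissed with Escape.
    console.log(dialog.returnValue);
  });
</script>
```

Calling show() instead of showModal() would open the same element as a non-modal dialog.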

The Popover API shipped in 2023–2024. popover="auto" on any element makes it behave as a popover with light-dismiss behavior, top-layer rendering, and accessibility integration. Tooltips, menus, and similar non-modal overlays no longer need the custom stacking and dismissal logic that libraries used to provide, although placing a popover next to its trigger remains a separate CSS concern.
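A sketch of a declarative popover with no script at all. The id and text are placeholders.

```html
<!-- popovertarget connects the button to the popover; activating it toggles the popover. -->
<button type="button" popovertarget="name-help">What is a display name?</button>

<!-- popover="auto" gives light dismiss: Escape or a click outside closes it. -->
<div id="name-help" popover="auto">
  The name other people see on your profile.
</div>
```

With default browser styles the popover appears centered in the viewport, so a tooltip-like placement still needs CSS of its own.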

<details> and <summary> provide a native disclosure widget — an expandable region with a clickable summary. Keyboard navigation works. The state is reflected in the DOM (<details open>). Screen readers announce it correctly. Most accordion implementations could be <details> elements with custom styling.
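An accordion sketched from <details> elements, with invented content. The shared name attribute, which makes the group exclusive, is a newer addition. Browsers without it simply let several sections stay open at once.

```html
<style>
  /* The open state lives in the DOM, so CSS can react to it directly. */
  details[open] > summary {
    font-weight: bold;
  }
</style>

<details name="faq" open>
  <summary>How do I change my display name?</summary>
  <p>Open your profile, edit the name field, and save.</p>
</details>

<details name="faq">
  <summary>Who can see my display name?</summary>
  <p>Anyone who can view your profile.</p>
</details>
```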

The new customizable <select> element is maturing as of 2024–2025. Standardization work, much of it incubated in the Open UI community group, has made the long-promised stylable native select real, with Chromium-based browsers shipping it first. A native <select> with appearance: base-select can be styled to match a design while preserving the keyboard behavior, accessibility, and form integration that custom-built dropdowns have to recreate, and browsers without support fall back to the classic control.
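A sketch of opting a <select> into the customizable rendering. The colors, spacing, and option values are arbitrary, and support is still uneven, so browsers that don’t recognize appearance: base-select will ignore those rules and show their usual control.

```html
<style>
  /* Opt in on both the control and its drop-down picker. */
  select,
  select::picker(select) {
    appearance: base-select;
  }

  select {
    border: 1px solid #767676;
    border-radius: 0.5rem;
    padding: 0.5rem 0.75rem;
  }

  select::picker(select) {
    border: 1px solid #767676;
    border-radius: 0.5rem;
  }

  option:checked {
    font-weight: 600;
  }
</style>

<label for="plan">Plan</label>
<select id="plan" name="plan">
  <option value="free">Free</option>
  <option value="team">Team</option>
  <option value="enterprise">Enterprise</option>
</select>
```

The markup is an ordinary <select>, so form submission and the label association work the same way in either rendering.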

<input type="..."> continues to grow. type="email", type="tel", type="url", type="date", type="time", type="color", type="range", type="search", type="file" — each of these provides input behavior, mobile keyboard adaptation, native validation, and platform conventions that custom field implementations routinely miss.
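A short form that leans on typed inputs, with hypothetical field names and limits. Each type changes the keyboard offered on touch devices, the validation the browser applies, or the control it draws.

```html
<form>
  <label for="email">Email</label>
  <!-- Validated as an email address; touch keyboards typically show the @ key. -->
  <input id="email" name="email" type="email" autocomplete="email" required>

  <label for="phone">Phone</label>
  <!-- No format validation by default, but touch devices offer a telephone keypad. -->
  <input id="phone" name="phone" type="tel" autocomplete="tel">

  <label for="start">Start date</label>
  <!-- Native date picker; min is enforced by constraint validation. -->
  <input id="start" name="start" type="date" min="2025-01-01">

  <label for="volume">Volume</label>
  <!-- Slider with keyboard support; the value is constrained to the given steps. -->
  <input id="volume" name="volume" type="range" min="0" max="100" step="5" value="50">

  <button type="submit">Save</button>
</form>
```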

The pattern: when in doubt, check whether the platform now has a native element for what you’re about to build. The answer in 2025 is more often yes than the answer in 2018 would have been.

Custom Elements as Platform-Blessed Extension

The platform isn’t only the built-in elements. It’s also a mechanism for extending the set of elements.

Custom elements (the standardized form, available reliably across browsers since 2018) let an application define new HTML elements that the browser treats as real elements. A <kit-button> defined as a custom element is, to the platform, an element. It can be queried with document.querySelector('kit-button'). It fires real DOM events. CSS can target it. It can be form-associated. Its lifecycle is managed by standard custom-element callbacks (connectedCallback, disconnectedCallback, attributeChangedCallback).

This is the platform’s blessed answer to “we need a component that doesn’t have a direct native equivalent.” Instead of building the component with <div>s and recreating native semantics from scratch, define it as a custom element, give it the right roles and behavior, and let the platform treat it as a first-class element. Lit, introduced more rigorously in Part V, is a thin authoring layer that makes writing custom elements pleasant without taking ownership of the application.
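A bare-bones sketch of the mechanism, written without Lit and using a hypothetical <demo-greeting> element that is not part of the book’s Kit. It exists only to show the lifecycle callbacks and that the result behaves like any other element.

```html
<demo-greeting name="Ada"></demo-greeting>

<script>
  // <demo-greeting> is a made-up element for illustration.
  class DemoGreeting extends HTMLElement {
    // Attributes listed here trigger attributeChangedCallback when they change.
    static observedAttributes = ["name"];

    connectedCallback() {
      // Called when the element is inserted into the document.
      this.render();
    }

    disconnectedCallback() {
      // Called when the element is removed; a place to release listeners or timers.
    }

    attributeChangedCallback() {
      this.render();
    }

    render() {
      const name = this.getAttribute("name") ?? "world";
      this.textContent = `Hello, ${name}`;
    }
  }

  customElements.define("demo-greeting", DemoGreeting);

  // To the rest of the page it is an ordinary element: it can be queried and updated.
  document.querySelector("demo-greeting").setAttribute("name", "Grace");
</script>
```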

The chapter doesn’t need to do more than name this here. Part V is where the custom-elements story gets unpacked. The point for now is that extending HTML is part of the platform’s design. The dichotomy of “use native HTML” or “build everything custom in JavaScript” is a false one. The third option, extending HTML with new elements that participate in the platform’s contracts, is the option the rest of this book builds on.

What HTML Teaches Modern Frontend

HTML teaches restraint.

Before creating a component, ask what native element it’s wrapping. Before creating a control, ask what behavior the browser already provides. Before adding ARIA, ask whether the correct native element would already express the semantics. Before managing state in JavaScript, ask whether the element already has state (<details open>, <input checked>, <dialog open>, <form> validity). Before building a custom transaction flow, ask whether a form already describes it.
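One way to see how much state the elements already hold is to read it back instead of mirroring it in a separate store. The ids and field names in this sketch are illustrative.

```html
<style>
  /* Native state is addressable from CSS without any script. */
  input:checked + label { font-weight: bold; }
  details[open] > summary { font-weight: bold; }
</style>

<form id="prefs">
  <input type="checkbox" id="news" name="news" checked>
  <label for="news">Send me news</label>

  <label for="contact">Email</label>
  <input id="contact" name="contact" type="email" required>

  <button type="submit">Save</button>
</form>

<details id="advanced">
  <summary>Advanced</summary>
  <p>More settings.</p>
</details>

<script>
  const form = document.getElementById("prefs");
  const advanced = document.getElementById("advanced");

  form.addEventListener("input", () => {
    // Every value here is read from the elements themselves.
    console.log({
      news: form.elements.news.checked,   // checkbox state
      valid: form.checkValidity(),        // constraint validation result
      advancedOpen: advanced.open,        // disclosure state
    });
  });
</script>
```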

This doesn’t make HTML sufficient for every application. Most non-trivial applications need JavaScript, frameworks, components, and abstractions on top of HTML. The book argues for those abstractions throughout the rest of Part II and into Parts III–VI. The argument is that the abstractions should build on HTML’s semantics, not replace them.

Modern frontend shouldn’t treat semantic HTML as beginner material. It’s one of the most leveraged tools the field has, because it transfers responsibility back to the platform. A <button> that’s actually a <button> carries decades of accumulated browser-side engineering for free. A <form> that’s actually a <form> participates in browser navigation, browser autofill, browser history, password manager integration, screen reader recognition, and a list of platform behaviors that no application’s custom code can match.

The platform did the work. The architecture’s job is to honor that work by using it.

What Comes Next

This chapter introduced HTML as an application language. The next chapter takes the same view of the DOM — the tree structure that HTML produces in memory — and argues that the DOM is a context tree, not just a render target. The chapters after that take attributes as a protocol surface, events as semantic communication, forms as transactions, CSS as a runtime, and accessibility as the platform’s foundational contract.

Each of these chapters makes a similar architectural move: this thing you’ve been treating as an implementation detail is actually load-bearing. Treat it that way.

Exercise: Refactor Div UI Into HTML

Start with a div-heavy interface:

<div class="page">
  <div class="title">Profile</div>
  <div class="field">
    <div class="label">Display name</div>
    <div class="input" contenteditable></div>
  </div>
  <div class="button">Save</div>
</div>

Refactor it using semantic HTML:

<main>
  <h1>Profile</h1>
  <form>
    <label for="display-name">Display name</label>
    <input id="display-name" name="displayName" required>
    <button type="submit">Save</button>
  </form>
</main>

Then compare:

Keyboard behavior. Can you Tab through the form? Does Enter submit it?
Accessible names. Open browser dev tools, find the accessibility panel, and look at the accessibility tree for both versions. What do screen readers see?
Focus order. Where does focus start? Where does it go on Tab? Where on Shift+Tab?
Form submission. Try submitting the form. Does the browser handle it automatically? What’s the request that goes to the server?
Required validation. Try submitting with an empty field. Does the browser block submission and explain why?
Amount of JavaScript needed. The div version needs scripts to do everything. The semantic version needs none until you want behavior the form-default-submit doesn’t cover. Count the lines.
CSS hooks. Are the styling targets you need still available in the semantic version? (They are.)

The goal is to experience how much application behavior appears when HTML is allowed to do its job. The semantic version, with zero JavaScript, is closer to a working profile form than the div version with hundreds of lines of script. The platform was doing the work the whole time.