What Makes Documentation AI-Ready: Structure

It’s that time again when I refresh my API Documentation course for the next term. In the course description, I emphasize that I’m teaching foundation skills to write good API documentation, not the latest tools. Students need to use tools to produce documentation, so I teach how to learn new tools while teaching those foundation skills. They learn GitHub as the representative example—experiencing both a documentation tool and how foundation skills look when instantiated as actual documentation.

Since I added AI tools to the course last year, each refresh has taken more time. AI technology advances faster than I can teach the course, creating real tension: should I keep focusing on fundamentals, or chase the latest AI capabilities?

A recent Write the Docs discussion gave me the answer I needed. The conversation started with an article promoting DITA as the solution for producing AI-readable documentation. It sparked debate about tools and approaches and revealed something more fundamental: AI tools don’t need anything new to read documentation. They need the same structural principles that information scientists and technical writers have relied on for decades.

The fundamentals aren’t outdated. They’re just being rediscovered by each new technology.

What the discussion revealed

An article titled “Why AI Search Needs Intent (and Why DITA XML Makes It Possible)” kicked off the Write the Docs discussion by promoting DITA (Darwin Information Typing Architecture) as the solution for making documentation AI-readable. The article’s premise is solid: users often can’t articulate what they need, and structured content helps guide them to answers.

The consensus of our discussion: structure is definitely important, but we were less certain that DITA is required to provide it. DITA does enforce topic structure and information architecture, but it requires significant overhead in authoring tools, training, and workflow changes. I’ve used DITA-like tagging at several organizations. It’s not trivial. Other approaches can achieve similar results: a Markdown-based system with consistent templates and frameworks like Diataxis, for example.

The real insight emerged as we debated tools: organized content matters more than the specific technology you use to create it. Human users struggle to articulate intent; AI tools need context to organize content. Both need clear organization to guide them to answers. What matters is the discipline to maintain clear hierarchy and purposeful organization, not the specific technology.

This isn’t a new discovery. It’s information architecture and findability principles that predate the web.

What AI prioritizes

To understand what “clear hierarchy and purposeful organization” means in practice, I asked Claude how it processes web content. The answer confirms that AI tools rely on the same structural elements that have made content findable for decades.

Claude listed the following elements in descending order of importance for processing documentation:

Most Important

  • Headings (h1-h6) – Primary structure signal; tells me content hierarchy and topic organization
  • Main content containers (<main>, <article>) – Distinguishes core content from navigation/ads/boilerplate
  • Paragraph text – The actual information I’m extracting
  • Page title (<title>) – Often the clearest statement of page purpose

Very Helpful

  • Semantic sectioning (<section>, <article>, <aside>) – Context about content relationships
  • Lists (<ul>, <ol>, <dl>) – Structured information that’s easy to parse
  • Link text and href attributes – Understanding relationships and navigation
  • Tables with proper headers – Tabular data structure
  • Meta description – Summary confirmation/context

Moderately Helpful

  • Structured data (JSON-LD, schema.org) – Explicit metadata about entities, events, products
  • Alt text – Image context when it adds information
  • Strong/em tags – Emphasis signals (though less reliable than structure)
  • Figure/figcaption – Image-text relationships
  • Time elements – Temporal context

Minimally Helpful

  • Navigation landmarks (<nav>) – Helps identify non-content areas
  • Footer/header elements – Typically redundant or boilerplate
  • ARIA labels – Useful when semantic HTML is insufficient
  • Blockquotes – Indicates quoted material
  • Code blocks – Preserves formatting context

Noise/Problematic

  • Excessive div/span soup – Makes structure harder to parse
  • Hidden content (display:none) – May or may not be included
  • Inline styles – Invisible to me
  • Empty or decorative elements – Clutter
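
To make the top of this list concrete, here is a minimal sketch of how a program might pull those signals out of a page: the page title plus the headings that sit inside <main> or <article>, ignoring headings in navigation and footers. It is an illustration only, not a description of how Claude or any other AI tool works internally. It uses Python's standard html.parser module; the names OutlineExtractor and extract_outline and the sample page are hypothetical.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
CONTENT_CONTAINERS = {"main", "article"}


class OutlineExtractor(HTMLParser):
    """Collects the page title and the headings inside <main> or <article>."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.headings = []          # list of (level, text) tuples
        self._container_depth = 0   # > 0 while inside a content container
        self._current = None        # tag whose text is being captured
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in CONTENT_CONTAINERS:
            self._container_depth += 1
        # Capture the title anywhere, but headings only inside core content.
        if tag == "title" or (tag in HEADING_TAGS and self._container_depth > 0):
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            # Collapse runs of whitespace left over from markup and line breaks.
            text = " ".join("".join(self._buffer).split())
            if tag == "title":
                self.title = text
            else:
                self.headings.append((int(tag[1]), text))
            self._current = None
        if tag in CONTENT_CONTAINERS and self._container_depth > 0:
            self._container_depth -= 1


def extract_outline(html):
    """Return (title, headings) for an HTML string."""
    parser = OutlineExtractor()
    parser.feed(html)
    parser.close()
    return parser.title, parser.headings


# Hypothetical page: headings in <nav> and <footer> should be ignored.
SAMPLE = """
<html><head><title>Orders API</title></head>
<body>
<nav><h2>Site menu</h2></nav>
<main>
<h1>Orders API</h1>
<h2>Create an order</h2>
<p>...</p>
<h2>List <em>open</em> orders</h2>
</main>
<footer><h2>Contact</h2></footer>
</body></html>
"""

title, headings = extract_outline(SAMPLE)
print(title)
for level, text in headings:
    print("  " * (level - 1) + text)
```

On a page built from div and span elements with styled text instead of real headings, the same code returns an empty outline, which is the practical meaning of the “Noise/Problematic” category.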

What this list reveals

Every priority on this list maps to established information architecture principles:

  • Heading hierarchy – Content organization and scannability (documented since at least the 1980s)
  • Semantic structure – Separating content from presentation (web standards foundation)
  • Clear metadata – Findability and classification (library science fundamentals)
  • Accessible markup – Universal design principles (formalized in the 1990s)

AI tools aren’t introducing new requirements. They’re restating what information professionals have known for decades: structure matters, semantics matter, organization matters. Visual styling alone has never been enough for findability.

Claude noted that the gap between “Most Important” and “Very Helpful” is larger than it appears. This list focuses on structural elements. Quality writing remains fundamental to documentation success, but that’s a topic for another article.

Why this matters for practice

This perspective resolves the tension I face with course updates. I don’t need to teach the latest AI features. I need to teach the structural principles that make content findable regardless of whether it’s being read by a human with a screen reader, parsed by a search engine, or processed by an AI tool.

Your templates and production systems need to deliver your high-quality writing with:

  • A clear heading hierarchy
  • Semantic HTML markup
  • Purposeful organization
  • An accessible structure

Pick the system that works best for your situation—DITA-based, Markdown-based, or something else entirely. You’ll know it’s working when it enforces these structural principles consistently. The specific technology matters less than the discipline to maintain the fundamentals.
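
One way to tell whether a system enforces a clear heading hierarchy is to check it automatically as part of publishing. The sketch below is one possible check, not a standard: it takes headings as (level, text) pairs and flags a document that does not start at level 1 or that skips a level on the way down. Both rules are assumptions chosen for illustration, and the function name find_heading_problems is hypothetical.

```python
def find_heading_problems(headings):
    """Return a list of problems for a sequence of (level, text) headings.

    Assumed rules for this sketch:
      * the first heading is level 1
      * a heading is at most one level deeper than the heading before it
    Moving back up by any number of levels is allowed.
    """
    problems = []
    previous = None
    for level, text in headings:
        if previous is None:
            if level != 1:
                problems.append(f"first heading '{text}' is h{level}, not h1")
        elif level > previous + 1:
            problems.append(f"'{text}' skips from h{previous} to h{level}")
        previous = level
    return problems


# Hypothetical outlines for an API reference topic.
well_structured = [
    (1, "Orders API"),
    (2, "Create an order"),
    (3, "Request body"),
    (2, "List orders"),
]
skips_a_level = [
    (1, "Orders API"),
    (3, "Request body"),
]

print(find_heading_problems(well_structured))
print(find_heading_problems(skips_a_level))
```

The same check applies whether the headings come from DITA output, rendered Markdown, or hand-written HTML, because it looks only at the resulting structure.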

For my students: master the principles of clear hierarchy, semantic structure, and purposeful organization. These transcend any specific tool. The AI systems you’ll use five years from now will rely on the same foundations, even if the implementation looks completely different.

For practitioners: if you’re panicking about making your documentation “AI-ready,” check whether you’re already following established information architecture principles. If you are, you’re already there. If you’re not, fixing that will serve both your human and AI readers—and will continue serving whatever comes next.

As AI disrupts technical communication, the fundamentals remain unchanged. The principles that guided technical writers fifty years ago still apply. Each new technology just gives us another opportunity to rediscover why they matter.

Further Reading

For readers interested in the historical foundations of these principles:

Information Architecture: Rosenfeld, L., & Morville, P. (1998). Information Architecture for the World Wide Web. O’Reilly Media. The foundational text establishing information architecture as a discipline for organizing web content.

Document Design and Hierarchy: Schriver, K. (1997). Dynamics in Document Design: Creating Text for Readers. Wiley. Research-based principles for content structure and scannability, with an 800-item bibliography documenting the evolution of document design theory.

Web Accessibility: W3C Web Content Accessibility Guidelines (WCAG). Current standards for accessible web content, building on universal design principles formalized in the 1990s.

Metadata and Findability: Dublin Core Metadata Initiative. Standardized metadata vocabulary developed in 1995 in response to the early web’s findability crisis.

For a comprehensive bibliography tracing these principles from the 1980s through current practice, see my Bibliography of Web Design Principles.
