What Makes Documentation AI-Ready: Structure

It’s that time again when I refresh my API Documentation course for the next term. In the course description, I emphasize that I’m teaching foundation skills to write good API documentation, not the latest tools. Students need to use tools to produce documentation, so I teach how to learn new tools while teaching those foundation skills. They learn GitHub as the representative example—experiencing both a documentation tool and how foundation skills look when instantiated as actual documentation.

After adding AI tools to the course last year, each refresh has taken more time. AI technology advances faster than I can teach the course, creating real tension: should I keep focusing on fundamentals, or chase the latest AI capabilities?

A recent Write the Docs discussion gave me the answer I needed. The conversation started with an article promoting DITA as the solution to produce AI-readable documentation. It sparked debate about tools and approaches and revealed something more fundamental: AI tools don’t need anything new to read documentation. They need the same structural principles that information scientists and technical writers have relied on for decades.

The fundamentals aren’t outdated. They’re just being rediscovered by each new technology.

What the discussion revealed

An article titled “Why AI Search Needs Intent (and Why DITA XML Makes It Possible)” kicked off the Write the Docs discussion by promoting DITA (Darwin Information Typing Architecture) as the solution for making documentation AI-readable. The article’s premise is solid: users often can’t articulate what they need, and structured content helps guide them to answers.

Our discussion consensus: structure is definitely important, but we were less certain that DITA is required to provide it. DITA does enforce topic structure and information architecture, but it requires significant overhead in authoring tools, training, and workflow changes. I’ve used DITA-like tagging at several organizations. It’s not trivial. Other approaches can achieve similar results: a Markdown-based system with consistent templates and frameworks like Diataxis, for example.

The real insight emerged as we debated tools: organized content matters more than the specific technology you use to create it. Human users struggle to articulate intent; AI tools need context to organize content. Both need clear organization to guide them to answers. What matters is the discipline to maintain clear hierarchy and purposeful organization, not the specific technology.

This isn’t a new discovery. It’s information architecture and findability principles that predate the web.

What AI prioritizes

To understand what “clear hierarchy and purposeful organization” means in practice, I asked Claude how it processes web content. The answer confirms that AI tools rely on the same structural elements that have made content findable for decades.

Claude listed the following elements in descending order of importance for processing documentation:

Most Important

Headings (h1-h6) – Primary structure signal; tells me content hierarchy and topic organization
Main content containers (<main>, <article>) – Distinguishes core content from navigation/ads/boilerplate
Paragraph text – The actual information I’m extracting
Page title (<title>) – Often the clearest statement of page purpose

Very Helpful

Semantic sectioning (<section>, <article>, <aside>) – Context about content relationships
Lists (<ul>, <ol>, <dl>) – Structured information that’s easy to parse
Link text and href attributes – Understanding relationships and navigation
Tables with proper headers – Tabular data structure
Meta description – Summary confirmation/context

Moderately Helpful

Structured data (JSON-LD, schema.org) – Explicit metadata about entities, events, products
Alt text – Image context when it adds information
Strong/em tags – Emphasis signals (though less reliable than structure)
Figure/figcaption – Image-text relationships
Time elements – Temporal context

Minimally Helpful

Navigation landmarks (<nav>) – Helps identify non-content areas
Footer/header elements – Typically redundant or boilerplate
ARIA labels – Useful when semantic HTML is insufficient
Blockquotes – Indicates quoted material
Code blocks – Preserves formatting context

Noise/Problematic

Excessive div/span soup – Makes structure harder to parse
Hidden content (display:none) – May or may not be included
Inline styles – Invisible to me
Empty or decorative elements – Clutter
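
To make the list concrete, here is a minimal Python sketch of why structural elements are such easy signals to extract. It is an illustration only, not a description of how Claude or any other AI tool actually processes pages. The sample page, the class name OutlineExtractor, and the function extract_outline are all hypothetical. The sketch uses only the standard library's html.parser to pull the page title, the headings, and the text inside <main>, while ignoring navigation and footer boilerplate.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
SKIP = {"nav", "footer", "aside", "script", "style"}  # treated as boilerplate (assumption)
TRACKED = HEADINGS | SKIP | {"title", "main"}


class OutlineExtractor(HTMLParser):
    """Collects the page title, headings, and text found inside <main>."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.headings = []   # [level, text] entries in document order
        self.main_text = []  # text fragments found inside <main>
        self._open = []      # tracked tags that are currently open

    def handle_starttag(self, tag, attrs):
        if tag not in TRACKED:
            return
        in_boilerplate = any(t in SKIP for t in self._open)
        self._open.append(tag)
        if tag in HEADINGS and not in_boilerplate:
            self.headings.append([int(tag[1]), ""])

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed tags.
        if tag in self._open:
            while self._open.pop() != tag:
                pass

    def handle_data(self, data):
        text = " ".join(data.split())
        if not text:
            return
        if "title" in self._open:
            self.title = (self.title + " " + text).strip()
        elif any(t in SKIP for t in self._open):
            return  # navigation, footer, and similar boilerplate
        elif self._open and self._open[-1] in HEADINGS:
            heading = self.headings[-1]
            heading[1] = (heading[1] + " " + text).strip()
        elif "main" in self._open:
            self.main_text.append(text)


def extract_outline(html):
    """Return the title, headings, and main text of an HTML page."""
    parser = OutlineExtractor()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title,
        "headings": [(level, text) for level, text in parser.headings],
        "main_text": parser.main_text,
    }


# Hypothetical sample page, invented for this example.
SAMPLE = """
<html><head><title>Widget API: Authentication</title></head>
<body>
<nav><a href="/">Home</a> <a href="/docs">Docs</a></nav>
<main>
<h1>Authentication</h1>
<p>Every request needs an API key.</p>
<h2>Getting a key</h2>
<p>Create a key in the dashboard.</p>
</main>
<footer>Copyright notice</footer>
</body></html>
"""

outline = extract_outline(SAMPLE)
print(outline["title"])
for level, text in outline["headings"]:
    print("  " * (level - 1) + text)
```

With semantic markup, a few dozen lines are enough to separate the page's purpose, outline, and core content from its boilerplate. If the same page were built from unlabeled div and span elements, none of these rules would apply, and the extractor would have to guess.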

What this list reveals

Every priority on this list maps to established information architecture principles:

Heading hierarchy – Content organization and scannability (documented since at least the 1980s)
Semantic structure – Separating content from presentation (web standards foundation)
Clear metadata – Findability and classification (library science fundamentals)
Accessible markup – Universal design principles (formalized in the 1990s)

AI tools aren’t introducing new requirements. They’re restating what information professionals have known for decades: structure matters, semantics matter, organization matters. Visual styling alone has never been enough for findability.

Claude noted that the gap between “Most Important” and “Very Helpful” is larger than it appears. This list focuses on structural elements. Quality writing remains fundamental to documentation success, but that’s a topic for another article.

Why this matters for practice

This perspective resolves the tension I face with course updates. I don’t need to teach the latest AI features. I need to teach the structural principles that make content findable regardless of whether it’s being read by a human with a screen reader, parsed by a search engine, or processed by an AI tool.

Your templates and production systems need to deliver your high-quality writing with:

A clear heading hierarchy
Semantic HTML markup
Purposeful organization
An accessible structure

Pick the system that works best for your situation—DITA-based, Markdown-based, or something else entirely. You’ll know it’s working when it enforces these structural principles consistently. The specific technology matters less than the discipline to maintain the fundamentals.
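
One way to check that a system enforces a clear heading hierarchy is a small lint step in the publishing workflow. The following is a minimal sketch, assuming a Markdown-based system that uses ATX-style (#) headings. The function name check_heading_hierarchy and the two rules it applies (exactly one top-level heading, no skipped levels) are illustrative assumptions, not a standard.

```python
import re

# ATX heading: one to six '#' characters, whitespace, then the heading text.
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def check_heading_hierarchy(markdown_text):
    """Return a list of heading problems found in a Markdown document."""
    problems = []
    previous_level = 0
    h1_count = 0
    in_fence = False

    for line_number, line in enumerate(markdown_text.splitlines(), start=1):
        # Skip fenced code blocks so '#' comments are not read as headings.
        # Simplification: only backtick fences are recognized.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue

        match = HEADING_RE.match(line)
        if not match:
            continue

        level = len(match.group(1))
        title = match.group(2)

        if level == 1:
            h1_count += 1
            if h1_count > 1:
                problems.append(
                    f"line {line_number}: extra top-level heading '{title}'"
                )
        if level > previous_level + 1:
            problems.append(
                f"line {line_number}: '{title}' jumps from level "
                f"{previous_level} to level {level}"
            )
        previous_level = level

    if h1_count == 0:
        problems.append("no top-level heading found")
    return problems


draft = "# Title\n\n## Setup\n\n#### Details\n"
for problem in check_heading_hierarchy(draft):
    print(problem)  # line 5: 'Details' jumps from level 2 to level 4
```

A check like this could run before publishing, in whatever way your toolchain allows. The same two rules apply equally to DITA or HTML output. Only the parsing step would change.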

For my students: master the principles of clear hierarchy, semantic structure, and purposeful organization. These transcend any specific tool. The AI systems you’ll use five years from now will rely on the same foundations, even if the implementation looks completely different.

For practitioners: if you’re panicking about making your documentation “AI-ready,” check whether you’re already following established information architecture principles. If you are, you’re already there. If you’re not, fixing that will serve both your human and AI readers—and will continue serving whatever comes next.

As AI disrupts technical communication, the fundamentals remain unchanged. The principles that guided technical writers fifty years ago still apply. Each new technology just gives us another opportunity to rediscover why they matter.
