What to measure?

Creative Commons licenseThat measuring API documentation is difficult is one of the things I’ve learned from writing developer docs for more than 11 years. Running the study for my dissertation gave me a detailed insight as to some of the reasons for this.

The first challenge to overcome is answering the question, “What do you want to measure?” A question that is followed immediately by, “…and under what conditions?” Valid and essential, but not simple, questions. Stepping back from that question, and a higher-level question comes into view, “What’s the goal?” …of the topic? …of the content set? and then back to the original question, of the measurement?

For my dissertation, I spent considerable effort scoping the experiment down to something manageable, measurable, and meaningful–ending up at the relevance decision. Clearly there is more to the API documentation experience than just deciding if a topic is relevant, but that’s a pivotal moment in the content experience. The relevance decision also seemed to be the most easily identifiable, discrete event that I could identify in the overall API reference topic experience. It’s a pivotal point in the experience, but by no mean the only one.

The processing model I used was based on the TRACE model presented by Rouet (2006). Similar cognitive-processing models were also identified in other API documentation and software development research papers. In this model, the experiment focuses on step 6.

The Task-based Relevance Assessment and Content Extraction (TRACE Model) of document processing Rouet, J.-F. (2006). The Skills of Document Use: From Text Comprehension to Web-Based Learning (1st ed.). Lawrence Erlbaum Associates.
The Task-based Relevance Assessment and Content Extraction (TRACE Model) of document processing
Rouet, J.-F. (2006). The Skills of Document Use: From Text Comprehension to Web-Based Learning (1st ed.). Lawrence Erlbaum Associates.

Even in this context, my experiment studies a very small part of the overall cognitive processing of a document and an even smaller part of the overall task of information gathering to solve a larger problem or to answer a specific question.

To wrap this up by returning to the original question, that is…what was the question?

  1. The goal of the topic is to provide information that can be easily accessible to the reader.
  2. The easily accessible goal is measured by the time it takes for the reader to identify whether the topic provides the information they seek or not.
  3. The experiment simulates the readers task by providing the test participants with programming scenarios in which to evaluate the topics
  4. The topics being studied are varied randomly to reduce order effects and bias and participants see only one version of the topics to bias their experience by seeing other variations.

In this experiment, other elements of the TRACE model are managed by or excluded from the task.

API reference topic study – thoughts

Last month, I published a summary of my dissertation study and I wanted to summarize some of the thoughts that the study results provoked. My first thought was that my experiment was broken. I had four distinctly different versions of each topic yet saw no significant difference between them in the time participants took to determine the relevance of the topic to the task scenario. Based on all the literature about how people read on the web and the importance of headings and in-page navigation cues in web documents, I expected to see at least some difference. But, no.

The other finding that surprised me was the average length of time that participants spent evaluating the topics. Whether the topic was relevant or not, participants reviewed a topic for an average of about 44 seconds before they decided its relevance. This was interesting for several reasons.

  1. In web time, 44 seconds is an eternity–long enough to read the topic completely, if not several times. Farhad Manjoo wrote a great article about how people read Slate articles online, which agrees with the widely-held notion that people don’t read online. However, API reference topics appear to be different than Slate articles and other web content, which is probably a good thing for both audiences.
  2. The average time spend reading a reference topic to determine its relevance in my study was the same whether the topic was relevant to the scenario or not. I would have expected them to be different–the non-relevant topics taking longer than the relevant ones on the assumption that readers would spend more time looking for an answer. But no. They seemed to take about 44 seconds to decide whether the topic would apply or not in both cases.

While, these findings are interesting, and bear further investigation, they point out the importance of readers’ contexts and tasks when considering page content and design. In this case, changing one aspect of a document’s design can improve one metric (e.g. information details and decision speed) at the cost of degrading others (credibility and appearance).

The challenges then become:

  1. Finding ways to understand the audience and their tasks better to know what’s important to them
  2. Finding ways to measure the success of the content in helping accomplishing those tasks

I’m taking a stab at those in the paper I’ll be presenting at the HCII 2015 conference, next month.

API reference topic study – summary results

During November and December, 2014, I ran a study to test how varying the design and content of an API reference topic influenced participants’ time to decide if the topic was relevant to a scenario.


  • I collected data from 698 individual task scenarios were from 201 participants.
  • The shorter API reference topics were assessed 20% more quickly than the longer ones, but were less credible and were judged to have a less professional appearance than the longer ones.
  • The API reference topics with more design elements were not assessed any more quickly than those with only a few design elements, but the topics with more design elements were more credible and judged to have a more professional appearance.
  • Testing API documentation isn’t that difficult (now that I know how to do it, anyway).

The most unexpected result, based on the literature, was how the variations of visual design did not significantly influence the decision time. Another surprise was how long the average decision time was–almost 44 seconds, overall. That’s more than long enough to read the entire topic. Did they scan or read? Unfortunately, I couldn’t tell from my study.


The experiment measured how quickly participants assessed the relevance of an API reference topic to a task-based programming scenario. Each participant was presented with four task scenarios:  There were two scenarios for each task: one to which the topic applied and another to which the topic was not relevant and each participant saw two of each. There were four variations of each API reference topic; however, each participant only saw one–they had no way to compare one variation to another.

The four variations of API reference topics resulted from two levels of visual design and two levels of the amount of information presented in the topic.

Low visual design High visual design Findings:
Information variations
copy_ld_hi copy_hd_hi
  • Higher credibility
  • More professional appearance
copy_ld_lo copy_hd_lo
  • Lower credibility
  • Less professional appearance
Design variations
  • Faster decision time
  • Lower credibility
  • Less professional appearance
  • Slower decision time
  • Higher credibility
  • More professional appearance

Continue reading “API reference topic study – summary results”

Is it really just that simple?

Photo of a tiny house. Is less more or less or does it depend?
A tiny house. Is less more or less or does it depend?
After being submerged in the depths of my PhD research project since I can’t remember when, I’m finally able to ponder its nuance and complexity. I find that I’m enjoying the interesting texture that I found in something as mundane as API reference documentation, now that I have a chance to explore and appreciate it (because my dissertation has been turned in!!!!). It’s in that frame of mind that I consider the antithesis of that nuance, the “sloganeering” I’ve seen so often in technical writing.

Is technical writing really so easy and simple that it can be reduced to a slogan or a list of 5 (or even 7) steps? I can appreciate the need to condense a topic into something that fits in a tweet, a blog post, or a 50-minute conference talk. But, is that it?

Let’s start with Content minimalism or, in slogan form, Less is more! While my research project showed that less can be read faster (fortunately, or I’d have a lot more explaining to do), it also showed that less is, well, in a word, less, not more. It turns out that even the father of Content Minimalism, John Carroll, agrees. He says in his 1996 article, “Ten Misconceptions about Minimalism,”

In essence, we will argue that a general view of minimalism cannot be reduced to any of these simplifications, that the effectiveness of the minimalist approach hinges on taking a more comprehensive, articulated, and artful approach to the design of information.

In the context of a well considered task and audience analysis, it’s easy for the writer to know what’s important and focus on it–less can be more useful and easier to grok. He says later in that same article,

Minimalist design in documentation, as in architecture or music, requires identifying the core structures and content.

In the absence of audience and task information, less can simply result in less when the content lacks the core structures and content and misses the readers’ needs.   More can also be less, when writers try to cover those aspects by covering everything they can think of (so-called peanut-butter documentation that covers everything to some unsatisfying uniform depth).

For less to be more, it has to be well informed. Its the last part that makes it a little complicated.

Carroll, John, van der Meij, Hans (1996): Ten Misconceptions about Minimalism. IEEE Transactions on Professional Communication, 39(2), 72-86.

Knowing your audience

Image of earth from space
Where to find our audience

For my PhD dissertation, I ran an unmoderated, online study to see how variations in page design and content of an API reference topic would affect how people found the information in the topic.

For the study, I solicited participants from several software development groups on LinkedIn and a few universities around the country. It’s definitely a convenience sample in that it’s not a statistically random sample, but it’s a pretty diverse one. Is it representative of my audience? I’m working on that. My suspicion, in the mean time, is that I’ve no reason to think it’s not, in that the people who read API documentation include a lot of people. For now, it’s representative enough.

A wide variety of people responded to the 750,000 or so software developers and people interested in software development that I contacted in one way or another. From those invitations (all in English, by the way), 436 people responded and 253 actually filled out enough of the survey to be useful. The 253 participants who completed the demographic survey and at least one of the tasks were from 29 different countries and reported speaking a total of 32 different native languages. Slightly less than half of the participants reported speaking English as their native language. After English, the top five non-English native languages in this group were: Hindi, Tamil, Telegu, Kannada, Spanish.

More than half of the participants didn’t speak English as their native language, but the vast majority of them should have no problem reading and understanding it. Of the 144 who didn’t speak English as their native language, 81% strongly agreed with the statement, “I can read, write, and speak English in a professional capacity or agreed or strongly agreed with the statement, “I can speak English as well as a native speaker.” So, while they are a very international group, the vast majority seem to speak English pretty well. The rest might need to resort to Google translate.

All of this supports the notion that not providing API documentation in any language other than English is not inconveniencing many developers—developers who respond to study invitations in English, at least. An interesting experiment might be to send this same survey to developers in other countries (India, China, Japan, LatAm, for starters) in their native languages to see how the responses vary.

So many studies. So little time.