What to measure?

That measuring API documentation is difficult is one of the things I’ve learned from writing developer docs for more than 11 years. Running the study for my dissertation gave me detailed insight into some of the reasons for this.

The first challenge to overcome is answering the question, “What do you want to measure?” That question is followed immediately by, “…and under what conditions?” Valid and essential, but not simple, questions. Stepping back from them, a higher-level question comes into view: “What’s the goal?” …of the topic? …of the content set? And then back to the original question: of the measurement?

For my dissertation, I spent considerable effort scoping the experiment down to something manageable, measurable, and meaningful, ending up at the relevance decision. Clearly, there is more to the API documentation experience than deciding whether a topic is relevant, but the relevance decision seemed to be the most easily identifiable, discrete event in the overall API reference topic experience. It’s a pivotal point in that experience, though by no means the only one.

The processing model I used was based on the TRACE model presented by Rouet (2006). Similar cognitive-processing models were also identified in other API documentation and software development research papers. In this model, the experiment focuses on step 6.

The Task-based Relevance Assessment and Content Extraction (TRACE) model of document processing
Rouet, J.-F. (2006). The Skills of Document Use: From Text Comprehension to Web-Based Learning (1st ed.). Lawrence Erlbaum Associates.

Even in this context, my experiment studies a very small part of the overall cognitive processing of a document and an even smaller part of the overall task of information gathering to solve a larger problem or to answer a specific question.

To wrap this up by returning to the original question, that is…what was the question?

  1. The goal of the topic is to provide information that is easily accessible to the reader.
  2. The “easily accessible” goal is measured by the time it takes the reader to identify whether or not the topic provides the information they seek.
  3. The experiment simulates the reader’s task by providing the test participants with programming scenarios in which to evaluate the topics.
  4. The topic variations are assigned randomly to reduce order effects and bias, and each participant sees only one version so that seeing other variations doesn’t bias their experience (a rough sketch of such an assignment follows below).

In this experiment, other elements of the TRACE model are managed by or excluded from the task.
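
To make that assignment concrete, here is a minimal sketch of how topic variations and scenario order might be randomized. The variation names, participant IDs, and scenario labels are hypothetical and not taken from the study materials.

```python
import random

# Hypothetical variations: 2 levels of visual design x 2 levels of information.
VARIATIONS = ["low-design/low-info", "low-design/high-info",
              "high-design/low-info", "high-design/high-info"]

def assign_participant(participant_id, scenarios):
    """Give a participant one topic variation and a shuffled scenario order."""
    rng = random.Random(participant_id)     # deterministic per participant
    variation = rng.choice(VARIATIONS)      # each participant sees only one variation
    order = list(scenarios)
    rng.shuffle(order)                      # randomize order to reduce order effects
    return {"participant": participant_id, "variation": variation, "scenarios": order}

# Example: four scenarios, two relevant to the topic and two not.
scenarios = ["relevant-1", "relevant-2", "not-relevant-1", "not-relevant-2"]
print(assign_participant("P001", scenarios))
```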

Party like it’s 1995

My GPS-III Pilot getting to know the 21st century, again

A couple of days ago, I fired up my trusty Garmin GPS-III Pilot to take on a ham-radio trip. After it initialized and found itself, it declared the date to be Dec 2, 1995. I let it sit for a while, thinking that it might just be slow in waking up. After several hours, however, it remained convinced that Christmas 1995 was just a few weeks away. I had another GPS (or six) to use, so I grabbed another one for the trip, and I didn’t get a chance to investigate this temporal lapse until this morning.

It turns out that this flashback was the result of a date-rollover error. Basically, the GPS uses a 10-bit number to count the number of weeks since January 6, 1980, when time began for GPS units. With 10 binary bits, you can represent 1,024 different things (weeks, in this case). After 1,024 weeks, the GPS’s calendar rolls over and returns to the beginning, back in 1980. This occurred in August 1999, but it was easily anticipated, and GPS units manufactured shortly before that date could be programmed to accommodate the event by correcting week values that would produce a date predating the unit’s manufacture. But it seems that my GPS has lived long enough for that correction to no longer work as it did 15 years ago (i.e., the corrected week values now result in a date that looks reasonable to the GPS: 1995).
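
To see why the unit lands in late 1995, here is a minimal sketch of the week arithmetic. The GPS epoch is real, but the example date and the simplified wrap-around calculation are illustrative assumptions, not the Garmin firmware’s actual logic.

```python
from datetime import date, timedelta

GPS_EPOCH = date(1980, 1, 6)   # week 0 of GPS time

def rolled_over_date(actual):
    """Date a receiver with an uncorrected 10-bit week counter would display."""
    days = (actual - GPS_EPOCH).days
    week, day_of_week = divmod(days, 7)
    displayed_week = week % 1024            # the 10-bit counter wraps every 1,024 weeks
    return GPS_EPOCH + timedelta(weeks=displayed_week, days=day_of_week)

# A hypothetical mid-2015 date wraps back to late 1995.
print(rolled_over_date(date(2015, 7, 18)))  # -> 1995-12-02
```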

It turns out that many vintage GPS receivers manufactured in the mid-to-late 90s have a similar problem, so it’s one that’s easily fixed by running a program written to update even older GPS units in preparation for the 1999 event. After running this program, my GPS-III Pilot is now living in the 21st century and is enjoying a morning in the sun on the back deck as it catches up with what’s been going on in GPS circles recently (i.e., it’s downloading the current satellite information from the GPS satellites).

This is interesting in that it’s a reminder of how even well-designed software can surprise you, and it makes me wonder why I’m so attached to a GPS unit that’s almost 20 years old. It could be because we have had a lot of adventures together, or it could be that I’m just a pack rat. Either way, we’re both synced up to the correct time.

UX Careers 2015 – Seattle

I went to the UX Careers 2015 panel last night to hear about the future of user experience from a panel of UX veterans. About 200 people were in attendance. The panel had a mix of in-house and agency UX perspectives, which provided interesting answers to the questions presented by the host.

About half the discussion was about job hunting: what to do and not do in an interview. How to stand out from the [ever-growing] crowd. How it’s a “job-seeker’s” market, yet the need for UX researchers is likely to start diminishing. While the answers seemed to be all over the map, more than one panelist remarked that there are many different contexts in which UX research and design are applied and that one size is not likely to fit all.

The career guidance could best be summarized as: be yourself, know yourself, present yourself with confidence, and find the best fit for you.

The other half of the discussion was about UX trends–the most interesting question to me being, “What’s the next big thing in UX?” (or something to that effect). First off, there is no “next big thing,” just an ongoing evolution of little things and the same old things (e.g., research and design methods) applied to different stuff. One point that was made, which resonated with my post about the Internet of things, was that the focus should be on the human experience (human-human interaction) more than the human-machine interaction/experience. The notion of invisible experiences and transparent processes was also mentioned.

It was exciting to hear that the best user interface of the future is no user interface. Now that’s a user interface I’ll be able to sketch, even with my horrid drawing skills. Of course, the user experience of that invisible user interface will still take some [considerable amount of research and] work to design, but at least the UI will finally be easy to sketch.

If anything, the session confirmed that we live in interesting times.

More goal setting

My blog vision is coming together. Shortly after my last update to the vision document, I realized I had another goal to add:

Limit blog posts to 500 words or fewer (about a 2-minute reading experience)

I’ve been aiming for this since the beginning–my intent being to keep each post succinct for the reader and force me to focus. As a result, I have a few posts still in the unpublished, draft state because they don’t meet that goal, but it’s been good practice.

In reviewing the earlier iterations of the blog vision, there seem to be some unarticulated elements. For example, the first goal was to generate content, but I haven’t set “develop an audience” as a goal. While I do, eventually, want to develop an audience, my feeling is that I’ll need some set of content before promoting to an audience will be productive. So, the goal for this year is to accumulate that content. Next year, I might add “attract an audience” as a goal, but I don’t think I’m ready for that yet.

The question this exercise presents is: should my long-term goal of developing an audience be included somewhere in the current vision document? That could be seen as a guiding, long-term goal or a short-term distraction. As a long-term goal, it would help guide short-term decisions. At the same time, having it in front of me might tempt me to start trying to attract an audience before I have enough content to make it worth their while to stay, making it even harder to win them back later.

I’ll need to think about that.

The Internet-of-things is not about things

Tracy Rolling wrote a detailed, first-person customer experience report on Medium about the limitations of activity tracking. It’s not that we can’t track activities, but that the resulting information is something less than useful to the average (read: non-fanatic) consumer. In her article, she describes how her activity tracker can log all manner of data, but it really doesn’t provide much in the way of information that’s useful to her on a daily basis. Consequently, it should come as no surprise that activity trackers are abandoned after just a short period. Her article highlights the value of the end-to-end customer experience.

Looking in from the outside (I don’t own an activity tracker), I don’t know whether this fall into disuse is because the tracker puts itself out of a job (saying, in effect, “You’ve achieved your goal! Congratulations. My work is done.”) or because it just doesn’t live up to its promise, whatever that was at the time it was bought. In either case, it highlights what seems to be an inherent challenge that the Internet-of-things hasn’t overcome: providing durable customer value.

In spite of the recent hype, tracking things isn’t new. I’ve been connecting things to devices and vice versa for 35 years. Connecting them to the Internet is new, but recording telemetry is not. The fact that every person in the civilized world is carrying a powerful computer (usually in the shape of a telephone) is a recent phenomenon that one would think holds some promise. However, without a need (be it present or latent), connecting monitoring transducers to a smartphone or the Internet still seems like it’s in the solution-looking-for-a-problem phase of technology.

It’s not unlike the early days of cell phones or PCs–when the technology was a solution in search of a problem. Eventually, problems will be identified (or invented) and then connected with solutions. However, in the meantime, there’s much to learn about both.

API reference topic study – thoughts

Last month, I published a summary of my dissertation study, and I wanted to capture some of the thoughts that the study results provoked. My first thought was that my experiment was broken. I had four distinctly different versions of each topic, yet I saw no significant difference between them in the time participants took to determine the relevance of the topic to the task scenario. Based on all the literature about how people read on the web and the importance of headings and in-page navigation cues in web documents, I expected to see at least some difference. But no.

The other finding that surprised me was the average length of time that participants spent evaluating the topics. Whether the topic was relevant or not, participants reviewed a topic for an average of about 44 seconds before they decided its relevance. This was interesting for several reasons.

  1. In web time, 44 seconds is an eternity–long enough to read the topic completely, if not several times. Farhad Manjoo wrote a great article about how people read Slate articles online, which agrees with the widely held notion that people don’t read online. However, API reference topics appear to be different from Slate articles and other web content, which is probably a good thing for both audiences.
  2. The average time spent reading a reference topic to determine its relevance in my study was the same whether the topic was relevant to the scenario or not. I would have expected them to differ, with the non-relevant topics taking longer than the relevant ones, on the assumption that readers would spend more time looking for an answer. But no. They seemed to take about 44 seconds to decide whether the topic applied or not in both cases (a rough sketch of such a comparison follows this list).
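
As an illustration only (this is not the analysis pipeline used in the study), here is a minimal sketch of comparing decision times between relevant and non-relevant topics with a two-sample t-test. The file name, column names, and use of SciPy are assumptions.

```python
# Minimal sketch: compare decision times for relevant vs. non-relevant topics.
# The CSV layout (columns "relevant" and "decision_seconds") is hypothetical.
import csv
from statistics import mean
from scipy import stats

relevant_times, non_relevant_times = [], []
with open("decision_times.csv", newline="") as f:
    for row in csv.DictReader(f):
        bucket = relevant_times if row["relevant"] == "yes" else non_relevant_times
        bucket.append(float(row["decision_seconds"]))

t, p = stats.ttest_ind(relevant_times, non_relevant_times, equal_var=False)  # Welch's t-test
print(f"relevant mean:     {mean(relevant_times):.1f} s")
print(f"non-relevant mean: {mean(non_relevant_times):.1f} s")
print(f"t = {t:.2f}, p = {p:.3f}")
```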

While these findings are interesting and bear further investigation, they point out the importance of readers’ contexts and tasks when considering page content and design. In this case, changing one aspect of a document’s design can improve one metric (e.g., reducing information detail speeds the relevance decision) at the cost of degrading others (credibility and appearance).

The challenges then become:

  1. Finding ways to understand the audience and their tasks better to know what’s important to them
  2. Finding ways to measure the success of the content in helping accomplish those tasks

I’m taking a stab at those in the paper I’ll be presenting at the HCII 2015 conference, next month.

Setting a goal

Continuing my site/personal vision and goal-setting exercise (picking up from my last post on the topic), I updated my vision document with my first goal. Goals should be SMART:

  • Specific
  • Measurable
  • Achievable
  • Realistic
  • Timely

So, with that in mind, I added my first goal:

Build base of content – at least 1, ideally 2, new topic(s) per week

My vision document is still a work in progress, but this is definitely a goal that fits within the vision and principles and is SMART.

I still have more things to consider. The biggest elephant in the room is the portfolio that I’ve managed to avoid for, well, the past 35 years. Likewise, I want to add more CV material, etc., to tell more about me, but I need to keep the Achievable and Realistic elements of the SMART mnemonic in mind, while not ignoring the Timely one.

It might only be one step at a time, but it’s one more step.

Checklist for technical writing

Devin Hunt’s design hierarchy

Devin Hunt posted this figure from “Universal Principles of Design,” which adapts Maslow’s Hierarchy of Needs to design. It seemed like the levels could also apply to technical writing. Working up from the bottom…

Functionality

As with a product, technical content must work. The challenge is knowing what that actually means and how to measure it. Unfortunately, for a lot of content, this is fuzzy. I’m presenting a paper next month that should help provide a framework for defining this, but, as with Maslow’s triangle, you must do this before you can hope to accomplish the rest.

For technical content, like any product, you must know your audience’s needs to know what “works” means. At the very least, the content should support the user’s usage scenarios, such as getting started or onboarding, learning common use cases, and having reference information to support infrequent, but important, usage or application questions. What this looks like is specific to the documentation and product.

Reliability

Once you know what “works” means, then you can tell if the content does and determine if it does so consistently. Again, this requires knowledge of the audience–not unlike product design.

This is tough to differentiate from functionality, except that it has the dimension of providing the functionality over time. Measuring this is a matter of tracking the functionality metrics over time.

Usability

Once you know what content that works looks like, you can make sure it works consistently and in a way that is as effortless as possible for the reader.

Separating usability from functionality is a tough one in the content case. If content is not usable, does it provide functionality? If you look closely, you can separate them. For example, a content set can have all the elements that a user requires, but those elements can be difficult to find or navigate. Likewise, the content might all exist, but be accessible in a way that is inconvenient or disruptive to the user. As with product development, understanding the audience is essential, as is user testing to evaluate this.

Proficiency

Can readers become expert at using the documentation? One could ask if they should become experts, but in the case of a complex product that has a diverse set of features and capabilities, it’s not too hard to imagine having a correspondingly large set of documentation to help users develop expertise.

What does this look like in documentation? At the very least, the terms used in the documentation should correspond to the audience’s vocabulary to facilitate searching for new topics.

Creativity

Not every product supports creativity, nor does every documentation set. However, those that do make the user feel empowered and are delightful to use. A noble, albeit difficult, goal to achieve, but something worthy of consideration.

This might take the form of community engagement in forums, or ongoing updates and tips to increase the value of the documentation and the product to the audience.

Getting past authentic

I’m still working on the blog’s vision and goals, and it occurred to me why “authentic” was such a sticky wicket: the meaning has been stretched some. To me, it’s summed up as “what you see is what you get,” and therein lies the problem: I don’t look like much, unless you know where to look, I suppose.

The challenge comes when I’ve had to choose between making an impression and making an impact; I’ve preferred to make an impact. Sometimes, on a good day, the impact is what makes an impression. Oftentimes, however, the impact comes at the cost of making an impression, or at least an immediate impression. Sometimes, to be completely honest, I strike out and make neither (or worse). Those, I chalk up to live and learn, and try not to repeat them.

Back to the blog. I aspire for the blog to have a positive impact and make an impression, but can I do that and be authentic? I think so, as long as the impression comes from the impact. In a world that can’t see past impressions, however, that’s going to come with a cost. But it’s a cost that’s lower, in the long run, than optimizing for impression over impact.

I think I can get past authentic now that I’ve operationalized it more clearly.

With that, I’ve updated my vision document.

In this latest update, I:

  • Added a new audience segment: Amateur radio. How could I forget that?
  • Edited the vision to engage in a conversation, not just contribute (i.e. toss things into) one.
  • Added a new principle: to strive for craftsmanship.

The last one is a personal goal as well and speaks back to how I’ve operationalized “authentic.” I want this work to have a clean and professional sense about it. If it doesn’t now, I want it to move towards that goal as I go along.

Interesting. By clearing up one principle, I was able to reveal another.

Cool.

API reference topic study – summary results

During November and December, 2014, I ran a study to test how varying the design and content of an API reference topic influenced participants’ time to decide if the topic was relevant to a scenario.

Summary

  • I collected data from 698 individual task scenarios completed by 201 participants.
  • The shorter API reference topics were assessed 20% more quickly than the longer ones, but were less credible and were judged to have a less professional appearance than the longer ones.
  • The API reference topics with more design elements were not assessed any more quickly than those with only a few design elements, but the topics with more design elements were more credible and judged to have a more professional appearance.
  • Testing API documentation isn’t that difficult (now that I know how to do it, anyway).

The most unexpected result, based on the literature, was that the variations in visual design did not significantly influence the decision time. Another surprise was how long the average decision time was: almost 44 seconds, overall. That’s more than long enough to read the entire topic. Did they scan or read? Unfortunately, I couldn’t tell from my study.

Details

The experiment measured how quickly participants assessed the relevance of an API reference topic to a task-based programming scenario. Each participant was presented with four task scenarios: two to which the topic was relevant and two to which it was not. There were four variations of each API reference topic; however, each participant saw only one, so they had no way to compare one variation to another.

The four variations of API reference topics resulted from two levels of visual design and two levels of the amount of information presented in the topic.

The four topic variations (shown in the screenshots copy_ld_hi, copy_hd_hi, copy_ld_lo, and copy_hd_lo) combined low or high visual design with high or low information. The findings for each factor:

Findings: Information variations
  • High information: slower decision time, higher credibility, more professional appearance
  • Low information: faster decision time, lower credibility, less professional appearance

Findings: Design variations
  • Low visual design: lower credibility, less professional appearance
  • High visual design: higher credibility, more professional appearance
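
For anyone curious how a 2×2 comparison like this might be analyzed, here is a minimal sketch using a two-way analysis of variance. The data file, column names, and use of pandas/statsmodels are illustrative assumptions, not the analysis actually used in the study.

```python
# Minimal sketch: two-way ANOVA of decision time for a 2x2 design
# (visual design x information amount). The CSV layout is hypothetical.
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

data = pd.read_csv("relevance_decisions.csv")   # columns: design, information, decision_seconds

model = ols("decision_seconds ~ C(design) * C(information)", data=data).fit()
print(sm.stats.anova_lm(model, typ=2))          # main effects and the interaction term
```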
