More articles on API documentation

I’ve just collected some more articles for my bibliography of API documentation-related articles and the trend I saw earlier this year hasn’t changed much. In all fairness, eight months is probably not enough time to see a change given the pace of academic publishing. I now have 114 articles in my list of API documentation-related topics.

114!

Searching for API DOCUMENTATION produces a lot of hits on actual API documentation (good news: there’s a lot of API documentation out there!). Searching for Writing API Documentation produces more articles relevant to what I’m looking for. I’ve also merged my academic and non-academic API documentation bibliographic data, so I can compare and contrast them.

The merged lists have these characteristics:

114 articles! And I know I haven’t found them all yet.

71% (81/114) of the articles are from CS-oriented sources
29% (33/114) are from TC-oriented sources
81% of the CS-oriented articles are from edited publications (books, journals)
27% of the TC-oriented articles are from edited pubs.
27% (31/114) of the API documentation articles were published in 2017
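For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes the shares from the raw counts listed above. The counts come from the list; the names `counts` and `share` are just ones I made up for illustration, and this is not the stats software I normally use.

```python
# Counts taken from the list above; the variable and function names are
# illustrative assumptions, not part of any real analysis pipeline.
TOTAL = 114
counts = {
    "CS-oriented sources": 81,
    "TC-oriented sources": 33,
    "published in 2017": 31,
}


def share(part, whole):
    """Return part/whole as a percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


shares = {label: share(n, TOTAL) for label, n in counts.items()}

for label, n in counts.items():
    print(f"{shares[label]}% ({n}/{TOTAL}) {label}")
```

The CS and TC counts add up to the full 114, so every article in the merged list falls into one group or the other.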

So, what does this mean?

I don’t have my stats software nearby so my analysis tools are limited, but some impressions that are worth looking into when I get back to them are:

Quantifying the amount of, or the lack of, cross-referencing. While my papers reference CS articles, they also refer to design and other non-computer science references. The CS articles seem to cite only other CS articles. That still seems odd, but it’s also still just an impression.
The research methods used by the different disciplines. Quant? Qual? Case study? It might be interesting to see if there was a preference one way or the other.
Publications by year and by discipline to see if there are any particular patterns.
The themes or topics that each group prefers. This presents a challenge common in qualitative research: how do you define the groups, and how do you determine which group a particular topic belongs to?

The last point touches on the challenge of turning this into an academic study: what’s the research question? And then, what might be some of the conclusions that fall out from this? I’m still pondering this. Further, what are the definitions and selection criteria of “CS-audience,” “CS-publication,” etc.? There are a lot of things to nail down before this is ready for publication and citation. I’m still in the “getting my bearings” stage.

Nevertheless, some things seem pretty clear, for example,

API Documentation is being discussed more and more frequently in academic CS conferences (i.e. more than zero).
Looking at the authors of the CS articles, they seem to show a reasonably healthy mix of industry and academic authors collaborating on the papers (another impression not yet supported by data). Maybe there’s something TC could learn?

I still find it curious how TC talks about API documentation more in informal publications than in formal, edited articles, while CS talks about it more in formal articles. Interest bias? Importance bias? Selection bias? It could be that this is a topic more germane to graduate studies in CS than in Tech Comm. I’m still pretty sure I’m the only one who studied API documentation for their TC/HCDE thesis (if there’s another one of you out there, please say, “Hi!”).

And, that brings me to the known weaknesses of these observations.

Selection bias

Selecting the articles is biased by (at least):

The search term and Google’s search algorithm. I haven’t tried all possible search terms (I haven’t even tried coming up with such a list).
After Google’s selection bias, I look at the titles to identify relevant articles, so some poorly titled articles could end up on the cutting-room floor.
After that, I look at the seemingly relevant articles to understand them and I reject topics such as actual API documentation (e.g. reference topics).
I also reject promotional topics with no constructive, informative value. For example, I would reject a topic along the lines of “Use our tool to simplify these 10 best practices,” but not “10 best practices for API documentation” by a consultant who clearly wants you to read this and give them a call. Admittedly, a very fuzzy line.
What survives ends up on my list.

And after all this, I still leave out a lot of work, like all the related articles from Tom Johnson’s blog. Do I count those as individual articles, or do I group them as a single publication (e.g. a book)? I also don’t list magazines that don’t appear in the Google searches, such as the issue of the STC’s Intercom that was devoted to the topic. So, it’s far from perfect.

Say again?

Something that I’ve noticed is that many of the web articles repeat a lot of the same ideas. In the aggregate, I don’t see much new information being presented in the web articles. This is in contrast to the more curated content which seems to have a greater diversity of topics and information.

From a practitioner’s perspective, I can see how repetition would seem to confirm that these 10 best practices must really be the 10 best if everyone is repeating them. Despite the fact that this is a case of the false-consensus bias, I understand the appeal. Such repetition might provide a (possibly false) sense of validation, but it doesn’t move the discipline forward, and it drowns out the new ideas and discoveries.

Again, these are just my impressions from swimming with the data for a while.

I’m still pondering this, so ping me on Twitter if you have some ideas for what might be interesting among all this. In the meantime, here is some eye candy.
