3.09.2010

UPA Usability Metrics Workshop

I went to a great UPA workshop last month, presented by Bill Albert and Tom Tullis. It was a jam-packed day, covering a nearly overwhelming amount of content. Tom and Bill walked us through an overview of metric types and collection methods, and then concentrated on four different types of metrics: Performance, Self-Reported, Combined, and Observational.

One thing that struck me was the relationship between these metrics as they were presented. Performance and Self-Reported metrics are both gleaned through a structured interaction with a sample set of users. Combined metrics give an overall view of the health of an interaction, and are calculated using both Performance and Self-Reported metrics. Observational metrics are used less often than Performance metrics, and are seldom used in a Combined metric format.
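
To make the Combined idea concrete, here's a minimal sketch of one common way to build a Combined metric (not necessarily the exact method from the workshop): rescale each Performance and Self-Reported measure onto a common 0–100 scale, then average. All task data, scale ranges, and the equal weighting below are hypothetical.

```python
# A minimal sketch of a Combined usability metric: normalize each
# Performance and Self-Reported measure to a 0-100 scale, then average.
# All data, scale ranges, and the equal weighting are hypothetical.

def to_percent(value, worst, best):
    """Linearly rescale a raw metric so worst -> 0 and best -> 100."""
    return 100 * (value - worst) / (best - worst)

# Performance metrics (hypothetical results for one participant)
success_rate = 0.8   # 8 of 10 tasks completed
task_time_s = 95     # mean time on task, in seconds
errors = 2           # errors committed

# Self-Reported metric (hypothetical 1-5 satisfaction rating)
satisfaction = 4

scores = [
    to_percent(success_rate, worst=0.0, best=1.0),
    to_percent(task_time_s, worst=180, best=30),  # faster is better
    to_percent(errors, worst=5, best=0),          # fewer is better
    to_percent(satisfaction, worst=1, best=5),
]

combined = sum(scores) / len(scores)
print(f"Combined usability score: {combined:.1f} / 100")
```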

Having worked within a traditional survey-based market research environment for the past four years, I found the combination of these different types of performance indicators very interesting, especially the practice of using self-reported data to create derived KPIs (key performance indicators). It struck me that our rapidly expanding access to data has the capacity to completely overhaul the way we measure the success or failure of a product. Ultimately we must be able to demonstrate a profitable return on investment, but what happens when we have larger data sets to work with? Can we combine post-launch Observational metrics with pre-launch Performance tests to validate continuing development?

Market research and usability have their roots in the production and sale of products, not services. As we shift focus from the production of meatworld goods to the design of online services, we suddenly have the ability to reach larger and more disparate groups of consumers. With agile methodologies creating ever-shorter development cycles, we have more opportunities to use our user communities as test subjects for working services. Google famously tested 41 shades of blue for one of its toolbars, measuring click-through rates to determine which shade performed best. Services like Clickable let advertisers make immediate decisions about search ad copy: users can see the real results of their pay-per-click online ads and use that information to refine their messaging (compare that to a six-week messaging study). Granted, this is easy to do in a low-cost medium with essentially unlimited room for variation. (When a prototype takes at least a year to develop and costs at least a million dollars to produce, the barrier to this kind of live experimentation is high.) Happily, we're living in a new age now, where interfaces, services, and messages are expected to shift, grow, and be refined as consumer demand changes.
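
To make the click-through example concrete: comparing two live variants like that comes down to a two-proportion test on observed click-through rates. A rough sketch, with entirely made-up impression and click counts (not Google's actual data or method):

```python
# A rough sketch of comparing click-through rates between two live
# variants (e.g., two shades of blue) with a two-proportion z-test.
# Impression and click counts are entirely made up.
from math import sqrt

def ctr_z_test(clicks_a, views_a, clicks_b, views_b):
    """Return the z-statistic for the difference in click-through rates."""
    p_a = clicks_a / views_a
    p_b = clicks_b / views_b
    p_pool = (clicks_a + clicks_b) / (views_a + views_b)  # pooled rate
    se = sqrt(p_pool * (1 - p_pool) * (1 / views_a + 1 / views_b))
    return (p_a - p_b) / se

z = ctr_z_test(clicks_a=1_220, views_a=50_000, clicks_b=1_090, views_b=50_000)
print(f"z = {z:.2f}")  # |z| > 1.96 suggests a real difference at ~95% confidence
```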

So what does this mean for usability metrics? For virtual services we will need to find a way to integrate those traditional metrics with live Observational metrics. Heck, we could even create a way to automate this type of reporting. I've got some ideas on how to do this, but that will have to wait for another day.
