Why Web developers should care about analytics

I’m pretty sure the universe is trying to teach me something. For as long as I can remember, I’ve been dismissive of Web analytics. I’ve always felt that they’re for marketing people and that, at least in the realm of personal publishing, paying attention to analytics makes you some kind of sellout. Analytics is a discipline rife with unfounded claims and terrible, terrible products, as well as people engaging in cargo cultism that they pretend is analysis. Even the terminology is annoying. When people start talking about “key performance indicators” and “bounce rate” my flight instinct kicks in immediately.

In a strange turn of events, I’ve spent most of this year working in the field of Web analytics. I am a huge believer in making decisions based on quantitative analysis, but I had never connected that to Web analytics. As I’ve learned, Web analytics is just quantitative analysis of user behavior on Web sites. The problem is that it’s often misunderstood and usually practiced rather poorly.

The point behind this post is to make the argument that if you’re like me, a developer who has passively or actively rejected Web analytics, you might want to change your point of view. Most importantly, an understanding of analytics gives the team building for the Web a data-based framework within which they can discuss their goals, how to achieve those goals, and how to measure progress toward achieving those goals.

It’s really important as a developer to be able to participate in discussions on these terms. If you want to spend a couple of weeks making performance improvements to your database access layer, it helps to be able to explain the value in terms of the increased conversion rate that results from lower page load time. Understanding what makes your project successful and how that success is measured enables you to argue for your priorities and, just as importantly, to understand the arguments that other people are making for theirs. Will a project contribute to achieving the overall goals? Can its effect be measured? Developers should be asking these questions if nobody else is.

It’s also important to be able to contribute to the evaluation of metrics themselves. If someone tells you that increasing the number of pages seen per visit will increase the overall conversion rate on the site, it’s important to be able to evaluate whether they’re right or wrong. This is what half of the arguments in sports statistics are about. Does batting average or on-base percentage better predict whether a hitter helps his team win? What better predicts the success of a quarterback in football, yards per attempt or yards per completion? Choosing the right metrics is no less important than monitoring the metrics that have been selected.

Finally, it often falls on the developer to instrument the application to collect the metrics needed for analytics, or at least to figure out whether the instrumentation that’s provided by a third party is actually working. Again, understanding analytics makes this part of the job much easier. It’s not uncommon for non-developers to ask for metrics based on data that is extremely difficult or costly to collect. Understanding analytics can help developers recommend alternatives that are just as useful and less burdensome.

The most important thing I’ve learned this year is that the analytics discussion is one that developers can’t really afford to sit out. As it turns out, analytics is also an extremely interesting problem, but I’ll talk more about that in another post. I’m also going to revisit the analytics for this site, which I ordinarily never look at, and write about that too.

2 Comments

ben
September 29, 2012 at 10:30 am

It’s not that analytics isn’t worth getting into. Rather…

Doing it right requires a whole helluva lot of dedicated study and time away from the fun stuff that impresses people.

On the other hand, analytics makes genuinely objective A/B testing possible.

Chris Adams
September 29, 2012 at 12:44 pm

Good post — strong agreement on the “developers can’t really afford to sit out” conclusion.

One extension to your metric evaluation point: statistics and even general analytic skills are not widely taught in most educational programs. Unless you work with a bunch of scientists, economists or non-software engineers, it’s often the case that your most valuable contribution can simply be making sure that the numbers being used actually reflect what you intended to measure rather than a statistical mistake or an artifact of the data source.

Just to use a trivial example, basing decisions on Google Analytics can be extremely hazardous because they use averages exclusively rather than more robust calculations (medians, n-th percentile, etc.) which aren’t so vulnerable to outliers. I posted an example I found in May where 3 abnormal samples out of 200K distorted Google Analytics’ reported site load time by an order of magnitude. The timing indicators seem to be the least reliable, but everything using an average is affected because somehow a grade-school math mistake made it into the most popular analytics package on the market.
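
To make the outlier point concrete, here is a minimal sketch in Python. The numbers are invented for illustration and are not the data from the May example: it assumes 199,997 ordinary page loads of 2 seconds each plus 3 abnormal samples of 1,500,000 seconds (say, a tab left open for a couple of weeks). The `summarize` helper is a hypothetical name, not part of any analytics package.

```python
import statistics

# Hypothetical load-time samples, in seconds (assumed values, not real data):
# 199,997 ordinary page loads plus 3 abnormal ones.
ordinary = [2.0] * 199_997
abnormal = [1_500_000.0] * 3
samples = ordinary + abnormal


def summarize(values):
    """Return the mean next to two outlier-resistant statistics."""
    return {
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        # quantiles(n=100) returns 99 cut points; index 94 is the 95th percentile.
        "p95": statistics.quantiles(values, n=100)[94],
    }


summary = summarize(samples)
for name, value in summary.items():
    print(f"{name}: {value:.2f} s")
```

With these assumed inputs the three abnormal samples pull the mean up to roughly 24.5 seconds, more than ten times the typical load, while the median and the 95th percentile both stay at 2 seconds.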

(My wife is a scientist-turned-teacher and has some better, jaw-dropping examples of decisions made by people who didn’t control for selection bias or confused correlation and causation, which is particularly scary given how much is riding on stats at schools in the United States.)

rc3.org

Strong opinions, weakly held
