Programmer's Blues: August 2010

Monday, August 23, 2010

The Marginal Unit of Testing

Dear Blogspot,

Here's a puzzle for you. If society benefits from the availability of water, why is a cup of water not provided everywhere? Sure, they will offer you one at eating establishments, but why not at bookstores, or electronics stores? Granted, water is not free, but if it is so necessary for human survival, you'd think we would demand a glass upfront at every venue, just like we demand air to breathe wherever we go.

The reason we don't has to do with the costs versus the marginal benefit. Radio Shack would need to spend capital and recurring expenses to be able to offer something that most people would not even want to consume were it offered there. Meanwhile, the demand for drinks at restaurants is extremely high, due to the complementary good of Food that is consumed there. Put simply: the marginal benefit of that next glass of water, at most venues, is well below the cost of provision. Not at every venue, but at most.

This important economic insight applies equally to an important (and recently Hallowed and Revered, thanks to T.D.D.) object such as the Unit Test. A unit test, of course, is programming code written to test the functionality of other programming code. It is distinguished from integration or functional testing precisely because of the small segments of code that are the target of the unit tests. Unit Tests are almost always written by the same programmer who wrote (or will write) the code being tested.

Like water, unit tests are important. Also like water, the provision of unit tests is not free. However, unlike water, far too many developers (especially in the last 5-10 years) would be Horrified to discover even one "venue" where unit tests are not provided. This reflects, in my humble opinion, dear Blogspot, a failure to think on the margin.

As with Water and Radio Shack, spotting the most cost-inefficient places for unit tests is pretty easy: getters and setters, code that does trivial calculations, code that generates logs for developer purposes, code whose failure is more cheaply spotted by integration testing (such as user interfaces), etc. Beyond this it gets rather fuzzy. One might argue that code that has a high call count, has numerous dependencies, is modified frequently, has failed previously, or performs a function whose failure would be disastrous is a great candidate for unit tests.
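To make the marginal reasoning concrete, here is a minimal sketch in Python. Everything in it is my own assumption for illustration: the CodeUnit fields, the function names, and every number are made up, not measurements from any real project. The idea is only that a unit test earns its keep when the expected cost of the failures it would catch (and that nothing else would catch) exceeds the cost of writing and maintaining it.

```python
from dataclasses import dataclass


@dataclass
class CodeUnit:
    """A hypothetical piece of code we might write a unit test for."""
    name: str
    failure_probability: float  # assumed chance the unit breaks (0..1); higher for code
                                # that changes often, has many dependencies, or failed before
    failure_cost_hours: float   # assumed hours lost if a breakage slips through
    caught_elsewhere: float     # assumed share of failures integration testing catches anyway (0..1)
    test_cost_hours: float      # assumed hours to write and maintain the unit test


def marginal_benefit(unit: CodeUnit) -> float:
    """Expected hours saved by adding a unit test for this unit."""
    return unit.failure_probability * unit.failure_cost_hours * (1.0 - unit.caught_elsewhere)


def worth_testing(unit: CodeUnit) -> bool:
    """True when the marginal benefit exceeds the marginal cost of the test."""
    return marginal_benefit(unit) > unit.test_cost_hours


# Illustrative, invented numbers: a trivial accessor versus a critical calculation.
units = [
    CodeUnit("getter_setter", 0.01, 1.0, 0.9, 0.5),
    CodeUnit("billing_calculation", 0.2, 40.0, 0.3, 3.0),
]

for unit in units:
    verdict = "test it" if worth_testing(unit) else "skip it"
    print(f"{unit.name}: benefit {marginal_benefit(unit):.3f}h "
          f"vs cost {unit.test_cost_hours}h -> {verdict}")
```

The hard part, of course, is that nobody hands you these probabilities and costs. The sketch only shows the shape of the comparison, not a way to obtain the inputs.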

Either way, treating Unit Tests like Water and reasoning that "because they are important, they must be equally demanded in all situations" is both fallacious (see the Fallacy of Composition & Division) and a wasteful use of resources.

Thursday, August 5, 2010

Price Shopping

Dear Blogspot,

Today I looked at web site statistics. I saw interesting data, including numbers of page views, typical page navigation flows, and similar metrics. The web developer then tied those metrics directly into how his automated web site testing tool was configured. The output from the testing tool, combined with the metrics, then directly informed the time he spent improving and optimizing specific aspects of the web site. This started me thinking about whether such statistics might be useful in directing all web developer tasks.

In markets, prices are signals of public preference. Higher prices signal higher preference, and often cause more production effort in that direction. For a free public web site, especially one dedicated to an unreleased product, there are no prices anywhere to be found! Therefore, like the apparatchik economists of the 20th century, web developers have only statistics to approximate this function.

So, does this work? Is this a good approximation? Not even close...

In the first place, while usage statistics reflect demand, the only cost incurred by the free web page viewer is their time. Suppose that an attempt was made to make the site profitable by charging the user a fixed price for each page click. The (few) remaining web users' viewing habits would change considerably, and the usage statistics would be radically altered. Higher value pages would gain considerably relative to lower value pages. Users would likely use bookmarks to skip index/portal-like pages in favor of going directly to the pages they want most.

In the second place, while usage statistics aggregate access, they do not approximate the intensity of the desire for the various pages viewed. In economics, this intensity shows up as the price elasticity of demand: how strongly the quantity demanded responds to a change in price. This is why, in our hypothetical pay-per-click site, the fixed page price would cause those pages that people value below the fixed price, such as the index and portal-like pages, to lose traffic relative to other pages.
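The hypothetical pay-per-click site can be sketched in a few lines of Python. The page names, the per-view valuations, and the 0.25 price below are all invented for illustration. A real site has no way to observe these valuations, which is exactly the problem.

```python
def views_at_price(valuations, price):
    """Count the views whose assumed value to the viewer is at least the price."""
    return sum(1 for value in valuations if value >= price)


# Assumed value (in dollars) that each individual view is worth to its viewer.
# The index page gets many low-value views; the detail page gets fewer, high-value ones.
pages = {
    "index": [0.01] * 90 + [0.10] * 10,
    "product_details": [0.50] * 40,
}

# View counts when pages are free, and when every click costs a fixed 0.25.
free_views = {name: views_at_price(vals, 0.0) for name, vals in pages.items()}
paid_views = {name: views_at_price(vals, 0.25) for name, vals in pages.items()}

print("free:", free_views)
print("paid:", paid_views)
```

With the site free, the index page looks like the most demanded page by far. Once a price is attached, its traffic collapses while the detail page keeps all of its views, so the free statistics alone would have pointed the developer at the wrong page.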

Lastly, and this goes back to a previous blog post, demand determined from site statistics cannot be used to determine the profitability of spending more developer time on even the most-viewed pages. This is due to the difficulty in comparing the market cost of a developer's time with the time spent on particular pages. Perhaps maintaining the most popular pages has developer time costs that make such work unwarranted. Perhaps the developer should spend his time instead on NEW pages. Without prices at both ends of the equation, it's impossible to say.

Web statistics are very useful things. For the purposes that this particular web developer was putting them to, they were very near perfect. However, if such direct demand signals are otherwise useless in directing web developer tasks, how much harder is it to determine the best use of the time of non-web developers, who don't even have that?