Bloatware is a law of nature. Understanding it can help you avoid it

Today software can be churned out at impressive speed, but few have stopped to ask whether all the features being built were really necessary in the first place. Lean Startup, Agile, DevOps, automated testing and similar frameworks have made it possible to develop quality software quickly. But are all those features really used by real users, or were they just clever ideas and suggestions? Not much research exists, but the Standish Group's CHAOS Manifesto from 2013 has an interesting observation on the point.

“Our analysis suggests that 20% of features are used often and 50% of features are hardly ever or never used. The gray area is about 30%, where features and functions get used sometimes or infrequently. The task of requirements gathering, selecting, and implementing is the most difficult in developing custom applications. In summary, there is no doubt that focusing on the 20% of the features that give you 80% of the value will maximize the investment in software development and improve overall user satisfaction. After all, there is never enough time or money to do everything. The natural expectation is for executives and stakeholders to want it all and want it all now. Therefore, reducing scope and not doing 100% of the features and functions is not only a valid strategy, but a prudent one.”

CHAOS Manifesto 2013 

20% of features are the ones used most often. It looks like the Pareto principle is at work here. The Pareto principle states that 80% of the effect comes from 20% of the causes. Many things have been described with it, from the size of cities to wealth distribution to word frequencies in languages. An industry has even grown up around it, based on the bestselling book “The 80/20 Principle: The Secret of Achieving More with Less” by Richard Koch. Other titles expand on it: “The 80/20 Manager”, “80/20 Sales and Marketing” and “The 80/20 Diet”.

This could seem a bit superficial, and you would be forgiven for wondering whether there really is any reality to the 80/20 distribution. It could just as well be a figment of our imagination, an effect of confirmation bias: we only look for confirming evidence. Nevertheless, the principle turns out to rest on solid scientific ground when you dig a bit deeper.

The basis for the 80/20 principle
The Pareto principle is a specific formulation of a Zipf law. George Kingsley Zipf (1902-1950) was an American linguist who noticed a regularity in the distribution of words in a language. Looking at a corpus of English text, he noted that the frequency of a word is inversely proportional to its rank order. In English the word “the” is the most frequent and thus has rank order 1; it accounts for around 7% of all words. “Of” is the second most frequent word and accounts for 3.5%. If you plot a graph with rank order on the x-axis and frequency on the y-axis, you get the familiar long-tail distribution that Chris Anderson has popularised.
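The rank-frequency relationship can be sketched numerically. The snippet below is a minimal illustration, assuming an ideal Zipf distribution in which each rank's weight is 1/rank, normalised over the ranks considered:

```python
# Sketch of Zipf's law: an item's frequency is inversely proportional
# to its rank. Normalising 1/rank over the top N ranks gives each
# rank's expected share of the total.

def zipf_shares(n_ranks):
    """Expected frequency share of each rank under an ideal Zipf law."""
    weights = [1.0 / rank for rank in range(1, n_ranks + 1)]
    total = sum(weights)
    return [w / total for w in weights]

shares = zipf_shares(10)
# Rank 1 gets twice the share of rank 2 and three times that of rank 3.
print(f"rank 1: {shares[0]:.1%}, rank 2: {shares[1]:.1%}, rank 3: {shares[2]:.1%}")
```

Note how top-heavy the distribution is: even over only ten ranks, the first rank alone takes roughly a third of all usage.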

One thing to notice at this point is that the 80/20 split is somewhat arbitrary. It might as well be 95/10 or 70/15. What is important is the observation that a disproportionately large share of the effect comes from a small share of the causes.

While Chris Anderson’s point was that the internet opened up business opportunities in the tail, that is, in products that sell relatively infrequently, the point for software development is the opposite: do as little as possible in the tail.

Optimizing product development
We can recast the problem by applying Zipf’s law. Take your planned product and line up all the features you intend to build. If usage follows Zipf’s law, the most frequently used feature will be used twice as much as the second most used, and three times as much as the third.

In principle you could save a huge part of your development effort if you were able to find the 20% of features that would be used the most by your customers. How would you do that? One way is the Lean Startup way, which is reaching the mainstream. Here the idea is that you build some minimal version of the intended feature set of your product, either by actually building a version of it or by giving the semblance of it being there, and monitor whether that stimulates use by the intended users.

This is a solid and worthwhile first choice. There are, however, reasons why it is not always preferable. Even with a Lean Startup approach you have to do some work in order to test all the proposed features, and that amount of work need not be small. Remember, the idea of a Minimum Viable Product is only that it is minimal with regard to the hypotheses about its viability. That is not necessarily a small job.

The Minimum Viable Product could be a huge effort in itself. Take for example the company Planet Labs: their MVP was a satellite! It is therefore worthwhile to consider, even before building your Minimum Viable Product, what exactly is minimal.

Ideally you want a list of the most important features to put into your MVP. That way you will not waste any effort on features that are not necessary for it. Typically a product manager, product owner or the CEO dictates what should go into the MVP. That is not necessarily the best way, since their views could be idiosyncratic.

A better way
A better way to do this is to collect input on possible features from all relevant stakeholders. This will constitute your backlog. Make sure each feature is well enough described or illustrated to be used to elicit feedback. Here you have to consider your target group, the language they use and the mental models they operate with.

Once you have a full list of proposed features, the next step is to find a suitable group of respondents to test whether these features are really wanted. This group should serve as a proxy for your users; if you are working with personas, find subjects that are similar to your core personas. Then simply make a short description of each intended feature, or even illustrate it, and ask the subjects in a survey or some similar fashion: “If this feature were included in the product, how likely is it that you would use it, on a scale from 1 to 5?”

Once you have all the responses, calculate each feature’s score by summing the ratings it got. Then follow Zipf’s lead and rank the features from top to bottom. Starting with the highest scoring, keep adding features until you have included the top 20% of them; if the distribution is Pareto-like, those should account for around 80% of the total score. It is still a good idea to do a sanity check, though, so you don’t forget the login function or similar (you can trust algorithms too much).
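The scoring and ranking procedure can be sketched in a few lines. The feature names and ratings below are invented for illustration, and the cut is made at the top 20% of features by count, on the assumption that these capture the bulk of the total score:

```python
# Sketch of the ranking procedure: sum each feature's survey ratings,
# rank from top to bottom, and keep the highest scorers until 20% of
# the features are selected. Feature names and ratings are invented.

def top_features(ratings, fraction=0.2):
    """Rank features by total rating and return the top `fraction` of them."""
    scores = {name: sum(rs) for name, rs in ratings.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    cutoff = max(1, round(len(ranked) * fraction))  # keep at least one feature
    return ranked[:cutoff]

survey = {
    "search":        [5, 5, 4, 5, 4],
    "export_pdf":    [2, 1, 3, 2, 2],
    "login":         [5, 4, 5, 5, 5],
    "dark_mode":     [3, 2, 2, 3, 1],
    "notifications": [4, 3, 4, 3, 4],
}
print(top_features(survey))  # → ['login']
```

Running this against real survey data is where the sanity check matters: the algorithm only knows the scores, not which features are structurally indispensable.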

What to do
Now that you have saved 80% of your development time and cost, you can use the freed-up effort to increase the quality of the software. You could work on technical debt, making the product more robust while you wait for results.

You could also use this insight in your product intelligence and look at the 20% most frequently used features of your existing product. Once you have identified them, optimize them so they work even better; that would be a shortcut to happier customers. You could optimize response times for these particular features so the most important ones work faster. You could improve their visibility in the user interface so they are even easier to see and get to. Or you could use the insight in marketing, to help focus the positioning of your product and to communicate what it does best.

To sum up, product utilization seems to follow a Zipf law. Knowing the top 20% of features can help you focus development effort, but it can also help you focus marketing effort, user interface design and technical architecture.



Richard Koch: “The 80/20 Principle: The Secret of Achieving More with Less”

Chris Anderson: “The Long Tail”

Photo by flickr user mahalie stackpole under CC license