
In my last post I introduced the idea that software refactoring could be thought of as rather like dealing with a room of files.
Let’s take the Room of Files and work through how that gets tidied up (the full figures are shown in the table below). It is certainly true that this is only an illustration, and the figures have been chosen to come out as particularly favourable to the refactoring side of the story – that I will admit. However, I think as I run through this example with you, you will find it very instructive, because, though we might want to argue about the precise figures, many of the principles are much more incontrovertible. In any case, the likely benefits of any particular refactoring will always have to be judged on its own merits.

Let’s say, for argument’s sake, that before the room is tidied it typically takes a day (eight working hours) to find a file. Sometimes, when you get lucky, it takes only half an hour, sometimes as much as two days, but on average one day. So, one day is our current benchmark for finding files.
When we first go and hunt for a file, rather than spend a day looking for it, we start by spending two days (16 hours) tidying and organising (=refactoring) the room. Then we do the directly productive work which we set out to do: finding the file. Of course, the room is now in a much better state than it was before we tidied it, so it might take us, say, 6 hours rather than the typical 8 to find it. Nonetheless, rather than take just 8 hours to find the file, we have taken a total of 22 hours.
The next time we go and look for a file, rather than spend time directly looking, we first spend (say) another 6 hours tidying and organising the room. By now the room is really starting to improve. When we have finished tidying, we do the directly productive work we set out to do: find the file. This time, rather than the already improved 6 hours, it is likely to take much less, say 4 hours. Nonetheless, yet again, the time we have spent in total (6 hours of refactoring plus 4 hours to find the file, so 10 hours) is more than we might have spent if we had simply come and looked for the file (8 hours). You will see, however, that even if we leave the room in its current ‘partially refactored’ state, things have improved significantly over our initial situation. Now, each time we wish to find a file, even if we don’t spend any more time tidying, it will typically take only 4 hours. The vastly improved speed of finding a file is paid for once, but benefits us every time we revisit the room.
This is exactly how it is with refactoring code. Whenever we improve the clarity of the code, and thus the speed at which future people can understand and change the code, the benefits of any improvement apply for every future occasion on which that code is worked on.
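To make the "paid for once, benefits every time" point concrete, here is a small sketch of the arithmetic. It uses only the figures from the first two visits above and assumes, as the text supposes, that we stop tidying after the second visit and every later file takes 4 hours. The function and constant names are mine, purely for illustration.

```python
# All times are in hours and come from the walkthrough above.
BASELINE_FIND = 8.0                      # untidy room: one working day per file
SPENT_AFTER_TWO_VISITS = 16 + 6 + 6 + 4  # tidy + find on visit 1, tidy + find on visit 2
FIND_AFTER_TWO_VISITS = 4.0              # per-file time if we tidy no further (assumption)


def cumulative_hours(extra_visits):
    """Return (tidied, untidy) cumulative hours after `extra_visits` visits beyond the second."""
    tidied = SPENT_AFTER_TWO_VISITS + FIND_AFTER_TWO_VISITS * extra_visits
    untidy = BASELINE_FIND * (2 + extra_visits)
    return tidied, untidy


def break_even_visits():
    """Smallest number of extra visits at which the tidied room has caught up."""
    extra = 0
    while True:
        tidied, untidy = cumulative_hours(extra)
        if tidied <= untidy:
            return extra
        extra += 1


if __name__ == "__main__":
    print(break_even_visits(), cumulative_hours(break_even_visits()))
```

Under these assumptions the tidied room draws level four visits later (48 hours either way), and every visit after that is 4 hours cheaper than in the untidy room.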
The next row of imaginary figures is very interesting. Here, on the 3rd visit, we spend 3 hours refactoring and 3 hours finding our file. What is interesting is this: we could now find our file in an average of 4 hours, yet we still choose to spend as much time tidying as we then spend looking for the file. Even so, we have taken less time in total (6 hours) than in the scenario where we never refactor the room (8 hours).
Think about that for a moment. In coding terms, it is as if we have decided to spend just as long on refactoring as we will on the ‘actual’ coding, and yet, and here is the marvel of it – we have actually spent less time in total than we would have done under the scenario of not ever refactoring. We have spent twice as much time as we needed to get the job done (6 hours – 3 refactoring, and 3 finding the file), and yet we have still spent less time in total than the poor person who is working on the unrefactored room (8 hours).
I can imagine many of you now scurrying to the table of figures, trying to get to the bottom of this apparent smoke-and-mirrors trickery and find the error in the logic.
There is no error in the logic. The apparent deception comes from the fact that the benefit of a refactoring is seen after the refactoring has been done, and this current visit to the room of files is really reaping the benefits of all previous times we have spent tidying the room.
What it does illustrate is that in order to understand the benefits of refactoring, we can never consider a single refactoring in isolation. Almost any refactoring when looked at in isolation will probably not be worthwhile – sometimes it happens that the time spent in refactoring is immediately repaid in the speed of making the change we want, but this is usually the exception rather than the rule. No, the real gains usually come later, in all subsequent changes.
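The first three visits described so far can be tallied in a few lines. This is only a sketch using the figures quoted in the text (the variable names are mine); it shows each visit's own cost next to the running totals for the tidied and untidy rooms.

```python
from itertools import accumulate

# (hours tidying, hours finding) for visits 1 to 3, as quoted in the text.
VISITS = [(16, 6), (6, 4), (3, 3)]
BASELINE_FIND = 8  # hours per file in the room that is never tidied

# Cost of each visit taken on its own.
per_visit = [tidy + find for tidy, find in VISITS]

# Running totals for the tidied room and for the untidy room.
cumulative = list(accumulate(per_visit))
baseline_cumulative = [BASELINE_FIND * (i + 1) for i in range(len(VISITS))]

if __name__ == "__main__":
    for number, (cost, total, base) in enumerate(
        zip(per_visit, cumulative, baseline_cumulative), start=1
    ):
        print(f"visit {number}: this visit {cost}h, total {total}h, untidy total {base}h")
```

Visit 3 costs 6 hours against the untidy room's 8, yet the running total is still 38 hours against 24. The third visit only looks cheap because of the tidying already paid for on visits 1 and 2, which is exactly why a single refactoring cannot be judged in isolation.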
Imagine trying to get managerial support for a particular change where you want to spend two days doing it rather than a few hours, because you “want to do some refactoring you think is needed”. It is difficult. Support for refactoring must be won first in a general discussion of its advantages in the wider context, rather than in a specific debate about a single refactoring. Taken on its own, any one refactoring generally does not look worth doing; it only becomes valuable when we think of the many times in future that the code will be worked on.
Let us skip in the table to the 7th visit to the room. Here you will notice that 1 hour is spent refactoring, and one tenth of that (0.1 hours) in actually finding the file. What profligacy! Ten times as much time spent refactoring as actually doing the work? Of course we could never agree to that! You, my now learned reader, will already see the fallacy of such a claim. You will know that refactoring, even at ten times the time spent on the ‘actual’ work, will soon enough reap its rewards.
From the 8th to the 11th visit I have shown no refactoring at all. There comes a point where there is little to be gained from further refactoring.
The other thing to note on these visits is the difference in time to perform our required task of finding a file: 0.1 hours versus 8 hours in the still untidy room, which is 80 times quicker. Is this realistic? For the room of files, I think you can see that it is perfectly reasonable: if you have a room of files in complete and total disorder it could easily take a whole day to find something, yet if the room were beautifully and logically arranged a file could be found almost instantly.
But what of our software? Can it really be that bad? Yes, I think the ratio sometimes is that bad. I do think that with badly decayed software, things can take as much as 100 times as long! You can see and understand it clearly with a very simplified example like this room of files, but the reality is that it is often just as time-sappingly bad within poor code.
This huge amount of extra time needed to work on our software can apply from the smallest change to the largest. A bug will take several days to find and fix, rather than ten minutes (that’s a factor of 100 as near as makes no difference). A major change will take man-years rather than man-months.
On the 12th visit I have shown a little blip in the ‘hours spent refactoring’ column. In the previous visits, no time at all was spent refactoring, and here, all of a sudden, another half-hour is spent on it, without any apparent gain in subsequent times finding a file. Sometimes refactoring will be like this, with no immediately obvious gain in time. But any refactoring will be done because the person doing it perceives it as something useful for bringing more order to the situation. Perhaps they didn’t think of a way to improve things before; that’s fine. There are usually many visits to the same piece of code, and at each stage there is the opportunity to make improvements which suddenly become apparent, even if they hadn’t been thought of before.
Let us now look at the total time spent over all the iterations. When we look at the totals for the refactoring scenario, we have actually spent nearly twice as much time refactoring the room (30.5 hours) as we have finding files (17.2 hours). And yet, when we compare the overall totals for working with the room in the two ways, a room of files with refactoring and a room of files not refactored, the overall time where the room was refactored (47.7 hours) is less than half that where the room was not refactored (104 hours).
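The totals quoted above can be checked with a few lines of arithmetic. This is only a sketch: the four input figures are taken from the text, the names are mine, and the number of visits is not stated directly but inferred from the untidy total divided by the 8-hour benchmark.

```python
# Totals quoted in the text, in hours.
REFACTORING_TOTAL = 30.5   # time spent tidying, all visits
FINDING_TOTAL = 17.2       # time spent finding files, all visits
UNTIDY_TOTAL = 104.0       # total for the room that is never tidied
BASELINE_FIND = 8.0        # hours per file in the untidy room

# Overall total for the tidied room (47.7 hours in the text).
tidied_total = REFACTORING_TOTAL + FINDING_TOTAL

# Inferred, not quoted: how many visits the untidy total corresponds to.
visits = UNTIDY_TOTAL / BASELINE_FIND

# How the two kinds of work, and the two rooms, compare.
refactor_to_find = REFACTORING_TOTAL / FINDING_TOTAL
tidied_to_untidy = tidied_total / UNTIDY_TOTAL

if __name__ == "__main__":
    print(f"tidied total: {tidied_total:.1f}h over {visits:.0f} visits")
    print(f"refactoring / finding: {refactor_to_find:.2f}")
    print(f"tidied / untidy: {tidied_to_untidy:.2f}")
```

The ratios come out at roughly 1.77 and 0.46: refactoring took a little under twice as long as finding, and the tidied room cost a little under half of what the untidy one did.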
See my book The Refactoring Workout for more on refactoring.