
To help explain software refactoring, I want to introduce an analogy: the Room of Files.
Imagine you have a large room full of files, and it is a real mess. It has dozens and dozens of filing cabinets and shelves, holding hundreds and hundreds of paper files. This is where you regularly come to get the files you need, and it is a nightmare to find what you want. Sure, there is some order within it. But the chaos is so bad that what order there is...is very difficult, if not impossible, to spot.
That’s often what our software is like! Our software has some sort of order, but the disorder that has gradually accumulated has more and more taken over, until it becomes almost impossible to see the order through all the disorder.
Now imagine that you regularly have to come into this room and find a file or some papers that are needed. How easy is it? How much time does it take? It takes ages just to find where the thing you’re looking for is. Doesn’t that accurately describe what things are like in our software? Doesn’t that describe how long it often takes to understand all the parts that are causing the latest bug? Oh, you think that things are different in your software! Are you sure? How long does it take you to track down bugs? How often do functions get rewritten in different places with different names...just because people didn’t realise they already existed? Somewhere in the code. Somewhere the programmer just didn’t know about.
I have too often worked on code where I have almost wanted to cry because something which should be just so simple...is just so difficult. Tracking down and finding a relatively simple bug can take a day or more, when really it should only take half an hour...or less. Why? Because the code is in such disorder.
And as for adding new functionality, doing so with code in such a state can be nearly impossible.
It’s difficult to prove objectively, but when I work on code which is really bad, I get the feeling that programming tasks take something like 10 times as long as they would if the code were in a good state. Yes, 10 times as long. The poorly structured code simply makes it so. A task which should take half an hour takes a day. Assuming an eight-hour working day, that is 16 times as long by my reckoning.
Think of the Room of Files. If the room were extremely well-organised, how long would it take to find a file you wanted? A few seconds? Five or ten minutes max? And what about with the room in a really messed-up state? A couple of hours? All day? Maybe you wouldn’t even find it after a day because you missed it the first time through, and would have to go through everything all over again, but this time a lot more carefully and slowly.
And that’s often what our software is like. A job which should only take a short time takes not just twice as long, but easily five or ten times as long.
Sometimes people see refactoring as a waste of time. The train of thought has a certain logic to it. It goes like this. If I fix this bug, or add this new function, it will take me a certain amount of time, say one day. If I do refactoring as part of the task, the task will take me twice as long. Twice as long! You can imagine the non-understanding manager making you feel like you’ve asked to go on four weeks’ holiday just when the project is struggling to hit its deadline: Why on earth would you want to waste your time like that!? You can do it in a day, but you’re going to take two days to do it...why, because you want to do some refactoring?! And what will be the benefit to the customer of this refactoring? The code is a little bit prettier inside!?!
I think refactoring, done properly, will typically mean that coding takes twice as long in the microscopic view of things. Why? Because you will be continually finding ways to improve the code. But note well, I said in the microscopic view of things, that is, when we look at a single isolated task. Indeed, it more or less has to be the case that when you look at a single isolated task, doing some refactoring will make the task take longer. Why? Because almost by the definition of refactoring, the benefit comes when people work on this code afterwards. That future may be directly afterwards, but it is still the future. This is why I have said elsewhere that you don’t do refactoring if you are the last one ever to touch a piece of code. In the macroscopic view of things, though, when you look at the overall productivity of the programmer, it increases significantly. The programmer easily becomes, say, twice as productive, and more probably tasks can be done five or ten times as quickly, or even quicker.
Refactoring code is like tidying up in the Room of Files. Sure, if I just search through the mess of files for what I’m looking for, it’ll take me a certain amount of time (a day say), and if I also spend time doing some tidying up it might take me twice as long (two days say) in total to find the file I’m looking for, but the room will be a bit more organised. Sure, I’ve lost time on this first search. But, even if I haven’t completely sorted out the room, I’ve gained loads of time on the next search.
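To make the arithmetic of this trade-off concrete, here is a rough sketch in Python. It is only a toy model, not a measurement. The "twice as long" overhead and the ceiling of ten times faster come from the figures above; the assumption that each tidy-up makes later tasks 1.5 times faster, and all the names in the code, are made up for illustration.

```python
def total_days(num_tasks, base_days=1.0, refactor_overhead=0.0,
               speedup_per_task=1.0, max_speedup=10.0):
    """Cumulative days spent on num_tasks tasks done one after another.

    base_days         -- days one task takes in the messy code (assumed 1 day)
    refactor_overhead -- extra fraction of time spent tidying per task
                         (1.0 means each task takes twice as long)
    speedup_per_task  -- hypothetical factor by which each tidy-up speeds up
                         all later tasks (an illustrative guess)
    max_speedup       -- ceiling on the overall speed-up (the "ten times" above)
    """
    total = 0.0
    speedup = 1.0  # 1.0 means the code is still in its original messy state
    for _ in range(num_tasks):
        total += base_days * (1.0 + refactor_overhead) / speedup
        speedup = min(max_speedup, speedup * speedup_per_task)
    return total


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 20):
        without = total_days(n)
        with_tidying = total_days(n, refactor_overhead=1.0, speedup_per_task=1.5)
        print(f"{n:2d} tasks: {without:5.1f} days without refactoring, "
              f"{with_tidying:5.1f} days with")
```

Under these assumptions the first task costs two days instead of one, exactly as the sceptical manager fears, but the running total with tidying falls below the total without it after a handful of tasks. Change the assumed numbers and the crossover point moves, but the shape of the argument stays the same.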
And notice how I refactor (tidy and organise) the Room of Files. I don’t set aside a week or two, or even a month or two if needed, to really sort it out. I could. But in general there will not be the management support for that sort of full-time clear-up, even if that were the best way to do it.
Some people mistakenly think that because you adopt a policy of refactoring, you are going to be spending weeks on tasks which previously would have taken a day. That is not usually how one approaches refactoring.
What we do instead is this: whenever we are doing some needed work (looking for a file in the example, fixing a bug or adding some new feature in software), we spend a proportion of our time improving our ability to do the job on all future visits.
In fact, even if there were the possibility of a full-time clear-up, there would be a good argument against it: it is actually a lot easier to do the clear-up bit by bit whilst carrying out a very concrete task, such as looking for a file, than to go in and clear up purely for the sake of clearing up. If we do the clear-up as part of another task, it is easier to stay focused on which clear-up is actually useful, since we are searching for a particular item. It is also easier to stay motivated, since we have in mind at all times how this is really going to help us find this particular file, and other such files in the future.
See my book The Refactoring Workout for more on refactoring.