Writing Code With Text - What's That All About?

elegant chaos

February 04, 2011

Recently I’ve started some contracting work with a new client, and rather impressively they have a coding standard. This is most definitely a good thing.

However, this standard, like most, includes some stuff relating to text formatting, and it happens to be different from the way I like to do it.

I tend to agree with Sutter and Alexandrescu on this sort of thing. In paragraph 1 of item 0 of their C++ Coding Standards book, they say:

“Issues that are really just personal taste and don’t affect the correctness or readability don’t belong in a coding standard. Any professional programmer can easily read and write code that is formatted a little differently than they’re used to.”

They go on to say that you should try to use consistent formatting within a project, because jumping between styles is jarring. I agree in principle, but in practice I’d go a little bit further than them. People are different. Experienced programmers can read code in pretty much any format. Experienced programmers also have years (or decades) of muscle memory, which means that with the best will in the world, they’re likely to slip and use the formatting that they’re used to.

My advice would be - use common sense, and be forgiving. This cuts both ways of course. Whenever I’m working with other people’s code I try to adopt the style that seems to be prevalent in it. When I was younger I would dive in and reformat files - these days I just try to live with it. Having said that, when I write any substantial amount of new code, I’m not thinking about where to put the brackets, so I slip up and use my own style sometimes. What can I say? Live with it.

Thinking about all of this reminded me of something I’ve often thought. How bloody ridiculous is it that we’re still storing source code as plain text?

I don’t care if we input it as text, or view it as text - both are quite natural ways of expressing code (though things like Prograph prove that they are by no means the only way).

The thing I find annoying is that because we still store it as plain text, we can still get into arguments about whitespace, formatting etc, which is just ridiculous in this day and age.

If we stored it as structured text (e.g. XML), or in some sort of binary format like a parsed syntax tree, then code editors would be free to present it to us for editing using our own formatting preferences, safe in the knowledge that it would be stored in a formatting-neutral way, so the next person coming along could view and edit using their preferences.
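To make the idea concrete, here is a minimal sketch of what "formatting-neutral" means. It uses Python rather than a curly-bracket language purely because Python's standard `ast` module exposes the parser directly; the two sample snippets and the helper names `canonical_form` and `render` are made up for illustration, and `ast.unparse` needs Python 3.9 or later.

```python
import ast

# Two renderings of the same function, differing only in whitespace and layout.
STYLE_A = """
def area(width, height):
    return width * height
"""

STYLE_B = """
def area( width,height ) :
        return   width*height
"""


def canonical_form(source: str) -> str:
    """Return a layout-independent description of the parsed syntax tree."""
    # ast.dump omits line and column information by default, so only the
    # structure of the code survives, not how it was typed.
    return ast.dump(ast.parse(source))


def render(source: str) -> str:
    """Regenerate text from the tree, using the standard library's one style."""
    return ast.unparse(ast.parse(source))


# Both layouts collapse to the same stored form.
print(canonical_form(STYLE_A) == canonical_form(STYLE_B))

# Either one can be rendered back out as text for viewing.
print(render(STYLE_B))
```

A real editor would replace `render` with a pretty-printer driven by each person's preferences. The sketch also shows one cost of the approach: comments and blank lines are not part of this particular tree, so a practical storage format would have to keep them somewhere.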

Of course, in theory we can do this now even with plain text source files, but in practice we can’t, because everyone is so used to being able to open the text in any old editor and see exactly what they typed, down to the last space or tab character. That leads to polite disagreements about the position of curly brackets, etc, especially when one person accidentally changes what another person had done. Archaic or what?

And that’s not even to mention the obvious inefficiencies of parsing the bloody stuff every single time… we’re stuck in the 70s!

There are partial solutions to the formatting problem. For example, you can set up git hooks to reformat source files (using things like astyle) automatically when you check them out, and before you commit them. If I were running a coding shop that employed multiple people, that’s what I would do, as it would kill these sorts of arguments stone dead.
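As a rough sketch of the commit half of that setup, a pre-commit hook could look something like the following. Everything specific here is an assumption rather than a recommendation: the list of file extensions, the choice of `--style=allman`, and the helper names are all hypothetical, and it assumes `astyle` and `git` are on the PATH.

```python
import subprocess

# Assumption: which extensions count as source files in this project.
SOURCE_EXTENSIONS = (".c", ".cpp", ".h", ".hpp", ".m", ".mm")


def is_source_file(path: str) -> bool:
    """True if the path looks like something the formatter should touch."""
    return path.lower().endswith(SOURCE_EXTENSIONS)


def format_command(path: str) -> list[str]:
    """Build the formatter invocation for one file."""
    # Assumption: the team has agreed on Allman style. --suffix=none stops
    # astyle from leaving a backup copy of the original next to each file.
    return ["astyle", "--style=allman", "--suffix=none", path]


def staged_files() -> list[str]:
    """List files that are added, copied or modified in the index."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def main() -> int:
    """Reformat every staged source file, then stage the reformatted version."""
    for path in filter(is_source_file, staged_files()):
        subprocess.run(format_command(path), check=True)
        subprocess.run(["git", "add", path], check=True)
    return 0
```

To try it, this would be saved as an executable `.git/hooks/pre-commit` with a Python shebang line at the top and a final `raise SystemExit(main())` at the bottom. One known rough edge: re-adding the whole file will also stage any changes in it that had deliberately been left unstaged.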

That sort of thing only works, though, if everyone buys into it. Automatic formatters tend not to pass the Turing test (!), so sometimes they get things wrong. If you’re running an automatic formatter on your machine alone, everyone else may not even realise that a tool is responsible - and you will be perceived as an arrogant and ruthless re-formatter of everyone else’s code!

Anyway, I’m off - got lots of spaces to remove from my method declarations…