Tuesday, January 16, 2007

Python and whitespace

I've introduced Python to many programmers, and most have had the same initial reaction: "What? Leading whitespace is significant? Oh, I ain't touching that." This has always struck me as odd. I have strong feelings about Python's use of block indentation, and this blog post by Paul Bissex last week reminded me of them. And so, here they are.

Many serious Python programmers will tell you using whitespace to delineate blocks is no big deal. I disagree. Lacking syntactically significant whitespace is, in fact, a serious misfeature.

A bold statement, you say? Absolutely. And not one I would make without a strong argument to back it up. Here's the thing: every programmer I know uses indentation to indicate block structure. I know there are programmers out there who don't, but I feel comfortable stating that they are way, way in the minority.

If you accept as given that the overwhelming majority of programmers use indentation to demarcate blocks, then programming languages should treat leading whitespace as significant. Otherwise, it is possible to write code where what the indentation tells the reader disagrees with what the compiler actually parses, leading to bugs. Here's the canonical C/C++ example:
    if (something_is_true)
        do_one_thing();
        do_another_thing();
Without braces, only do_one_thing() is governed by the if; do_another_thing() always runs, no matter how it is indented. Many C/C++ programmers I know have made this mistake (myself included; in fact, I always wrap conditional blocks in { } nowadays for this very reason). Even if you are Joe Brilliant C++ Programmer and would never do anything this obvious and stupid, chances are good that the next guy who works on your code is not going to be as experienced or as smart as you. This mistake is not possible in Python.
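
For comparison, here is a minimal Python sketch of the same situation. The function names and the `calls` list are hypothetical stand-ins for the calls in the C example, not anything from a real codebase. The point is that in Python the indentation is the block structure, so the two layouts below are two different programs, and each one does what it looks like it does.

```python
def both_in_block(something_is_true):
    """Both calls are indented under the if, so both are conditional."""
    calls = []
    if something_is_true:
        calls.append("do_one_thing")
        calls.append("do_another_thing")
    return calls


def second_outside_block(something_is_true):
    """The second call is dedented, so it runs regardless of the condition."""
    calls = []
    if something_is_true:
        calls.append("do_one_thing")
    calls.append("do_another_thing")
    return calls
```

With a false condition, `both_in_block(False)` returns an empty list, while `second_outside_block(False)` returns `['do_another_thing']`. There is no way to indent the second call under the `if` and have it run unconditionally.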

By contrast, I have never heard a compelling argument not to have syntactically significant whitespace; most of the ones I've heard amount to "it's yucky". (Oh yeah, there's the usual bitching about spaces vs. tabs, and the "horror stories" of mixing the two. I have been using Python actively for about four years now, and A) I've been bitten by this "problem" fewer than a dozen times, B) it's always quickly become obvious what the problem is, and C) any reasonable editor can fix the problem in a matter of seconds.)
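
As a rough illustration of how quickly a tab/space mix-up surfaces: Python 3 (which postdates this post) refuses to compile indentation whose meaning depends on tab width, and raises `TabError` instead. The sketch below assumes Python 3; the `MIXED` snippet and the `check` helper are hypothetical names made up for the example.

```python
# Hypothetical snippet: the first body line is indented with a tab,
# the second with eight spaces. They may look aligned in an editor,
# but whether they match depends on the tab width.
MIXED = "if True:\n\tx = 1\n        y = 2\n"

# The same code indented consistently with spaces.
CLEAN = "if True:\n    x = 1\n    y = 2\n"


def check(source):
    """Try to compile the source and report whether indentation is consistent."""
    try:
        compile(source, "<example>", "exec")
    except TabError as exc:
        # TabError is a subclass of IndentationError, itself a SyntaxError.
        return "TabError: " + str(exc.msg)
    return "ok"
```

Here `check(CLEAN)` returns `"ok"`, while `check(MIXED)` returns a string starting with `"TabError"`, so the problem is reported before any of the code runs.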

What it boils down to is a choice between distaste and correctness. And, frankly, having spent years maintaining code written by other people, I'll pick correctness every time.
