In Defense Of Implicit Code In C++

tl;dr: When C programmers start using RAII in C++, they’re less productive at first because they read return; as just returning, not as cleaning up and returning. They blame the language, but they only need to adjust their mental habits a little. The problem isn’t C++ (which has plenty of other problems); it’s that they’re trying to write C in another language. Once you make the adjustment, you can keep resource-handling code separate from your main logic, and you’ll get it correct a lot sooner.

At work we’re having a debate about which language we should use to implement our software, C or C++. A lot has been written about why C++ sucks, and it really is a bloated language with a lot of traps for programmers at all levels of experience. But I believe that when it comes to writing good, efficient, system-level code, using the right set of C++ features can make your code better. Resource Acquisition Is Initialization (RAII) is one of those features.

In RAII, the compiler inserts implicit calls to destructors, so you don’t need to remember them, and therefore can’t forget them. But this can lead to confusion, because now it’s harder to know what a particular line of code does. People with a C background look at a statement like “return;” and think “that returns from the function,” not “it cleans up and returns.”  And until they change that mental habit, RAII is murky, confusing and full of gotchas.

Like many things in life, implicit code has advantages and disadvantages. Being explicit has the benefit that every line of code is “self contained.” To know what it does, you only need to look at that line of code. At a previous company, we had some horrible implicit code, where a[n] = b[n] created all sorts of temporaries and did all sorts of magic under the hood, because operator[] and operator= were overloaded. That style of code was promoted by those who were quick to perceive the benefits of abstraction, but slow to realize its costs. To understand the performance of that one line (where a and b were rows of a matrix) you had to find and understand 5 different classes. That was just too much for most people, and so a lot of sloppy code was written, which we spent months trying to understand and speed up.
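To make that hidden cost concrete, here is a minimal sketch. It is not the code from that company: Row, Matrix and copies_for_one_assignment are hypothetical names invented for illustration, and the by-value const operator[] is an assumed design mistake of the kind that produces such temporaries.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical row type that counts every full-row copy it performs.
struct Row {
    std::vector<double> data;
    static int copies;

    explicit Row(std::size_t n) : data(n) {}
    Row(const Row& other) : data(other.data) { ++copies; }
    Row& operator=(const Row& other) {
        data = other.data;
        ++copies;
        return *this;
    }
};
int Row::copies = 0;

// Hypothetical matrix whose const operator[] returns a row by value.
class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols) : rows_(rows, Row(cols)) {}

    Row& operator[](std::size_t n) { return rows_[n]; }

    // Assumed mistake: every read through a const Matrix copies a whole row.
    Row operator[](std::size_t n) const { return rows_[n]; }

private:
    std::vector<Row> rows_;
};

// Counts how many full-row copies the single statement a[2] = b[2] performs.
int copies_for_one_assignment() {
    Matrix a(4, 1000);
    const Matrix b(4, 1000);
    Row::copies = 0;   // ignore the copies made while building the matrices
    a[2] = b[2];       // one temporary copy of b's row, then one assignment
    return Row::copies;
}
```

In this sketch copies_for_one_assignment() returns 2: the innocent-looking line copies the row twice, and nothing on that line says so. You only find out by reading both operator[] overloads and the Row class.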

So there’s a cost to implicit code, but what problems does it let you avoid? With explicit resource management, there’s a convention that when a function allocates some resource, then later encounters an error, it should free the resource before returning. In C, you can’t tell the compiler about that convention, so you have to do the work of putting the free() call on every error path yourself. This means you might forget to put it somewhere, or someone adding a new error handler might forget to call free(), or someone reordering the code might overlook a call to free() they should have added, etc. So while explicit code makes it easy to see what a line of code does, it doesn’t make it easy to see if there’s anything missing.
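As an illustration, here is a minimal C-style sketch (written so that it also compiles as C++). The function name read_header_explicit and its parameters are made up for this example; the point is only the shape of the error paths.

```cpp
#include <cstdio>
#include <cstdlib>

// Reads `size` bytes from the start of the file at `path`.
// Returns 0 on success and -1 on any failure.
// Every error path has to release whatever was acquired before it.
int read_header_explicit(const char* path, std::size_t size) {
    char* buf = static_cast<char*>(std::malloc(size));
    if (buf == nullptr) {
        return -1;              // nothing acquired yet, nothing to free
    }

    std::FILE* f = std::fopen(path, "rb");
    if (f == nullptr) {
        std::free(buf);         // easy to forget
        return -1;
    }

    if (std::fread(buf, 1, size, f) != size) {
        std::fclose(f);         // now two things to release, in the right order
        std::free(buf);
        return -1;
    }

    // ... the main work on buf would go here ...

    std::fclose(f);
    std::free(buf);
    return 0;
}
```

Three resources-worth of cleanup appear four times in a function that does almost nothing. Adding one more check between fopen() and fread() means remembering to repeat both release calls again, and the compiler will not complain if you don't.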

It also means that code implementing separate concerns is mixed together, making it harder to get any one of them correct. The logic for allocating and freeing is mixed in with the main work of the function, as well as with error checking. If you’re looking at the code and thinking about the typical, non-error case, it’s easy to overlook a problem with error detection or resource management. So when I’m writing or reading the code, I find it hard to understand all aspects of the code in a single pass. Instead, after I’ve read the code once, I need to do a separate pass of “now did they remember to call free() everywhere?” With practice, you can keep a stack in your head of all the things allocated up to this line of code, and a mental checklist of all the error conditions you might want to check. But that’s a set of mental habits that you need to develop, and before you develop them, you get a lot of leaks and missed error checks.

An alternative is to specify all the steps in one place, e.g. a class definition, then have the compiler insert them for you. That’s what happens with std::unique_ptr, or with a class that locks and unlocks a mutex. It means that when variables go out of scope, things happen that aren’t explicitly listed in that line of code, so return; can perform arbitrary computation, as can }. That means you need to change the way you look at such statements, because new code conventions develop. But just because those code conventions are new, and different from what you’re used to, doesn’t mean they’re bad. Any tool takes a little getting used to, and until you do, there will be confusion and gotchas. I’d say the mental habits of the “implicit” approach are different, and easier for people to learn. Which is why I like the implicit approach in moderation.
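Here is a minimal sketch of what that looks like with the standard library. FileCloser, FilePtr, read_header_raii, counter_mutex, counter and bump_if_below are names invented for this example; std::unique_ptr, std::mutex and std::lock_guard are the real standard facilities.

```cpp
#include <cstddef>
#include <cstdio>
#include <memory>
#include <mutex>
#include <new>

// The cleanup step for a FILE* is written once, here.
struct FileCloser {
    void operator()(std::FILE* f) const { std::fclose(f); }
};
using FilePtr = std::unique_ptr<std::FILE, FileCloser>;

// Reads `size` bytes from the start of the file at `path`.
// Returns 0 on success and -1 on any failure.
int read_header_raii(const char* path, std::size_t size) {
    std::unique_ptr<char[]> buf(new (std::nothrow) char[size]);
    if (!buf) return -1;

    FilePtr f(std::fopen(path, "rb"));
    if (!f) return -1;          // buf is freed implicitly by this return

    if (std::fread(buf.get(), 1, size, f.get()) != size)
        return -1;              // f is closed, then buf is freed

    // ... the main work on buf would go here ...

    return 0;                   // the same cleanup runs on the success path
}

std::mutex counter_mutex;
int counter = 0;

// Increments the shared counter unless it has already reached `limit`.
bool bump_if_below(int limit) {
    std::lock_guard<std::mutex> lock(counter_mutex);
    if (counter >= limit) return false;   // unlocks the mutex
    ++counter;
    return true;                          // also unlocks the mutex
}
```

No return statement in either function mentions cleanup, yet none of them can leak the buffer, leave the file open, or leave the mutex locked. The price is the one this post is about: you have to read return; as "clean up and return."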

So it’s true you have to learn something about the order of constructors and destructors to get std::unique_ptr right. But it’s actually pretty easy, because it’s designed to do the obvious thing as much as possible.
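The rule to learn is that local objects are destroyed in the reverse order of their construction, and only the ones that were actually constructed. A small sketch that records the order, using a made-up Tracer class and work function:

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper that logs when it is constructed and destroyed.
class Tracer {
public:
    Tracer(std::string name, std::vector<std::string>& log)
        : name_(std::move(name)), log_(log) {
        log_.push_back("acquire " + name_);
    }
    ~Tracer() { log_.push_back("release " + name_); }

    Tracer(const Tracer&) = delete;
    Tracer& operator=(const Tracer&) = delete;

private:
    std::string name_;
    std::vector<std::string>& log_;
};

// Acquires two "resources" and optionally bails out after the first one.
void work(bool fail_early, std::vector<std::string>& log) {
    Tracer a("a", log);
    if (fail_early) {
        return;                 // only a exists, so only a is released
    }
    Tracer b("b", log);
    log.push_back("work");
}                               // b is released first, then a
```

Calling work(false, log) leaves log holding "acquire a", "acquire b", "work", "release b", "release a". Calling work(true, log) on an empty log gives just "acquire a", "release a". That is the same stack you would otherwise keep in your head, except the compiler keeps it for you.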


1 Response to In Defense Of Implicit Code In C++

  1. mortoray says:

    I think that so long as what is happening implicitly makes sense the typical programmer will understand. However, there are a lot of ways to abuse this implicit code, either through destructors, or operator overloading, and produce garbage. Of course, it’s quite easy to produce garbage in pretty much any language.
