Should You Use Nested Functions to Encapsulate Logic?
Sometimes simple questions lead you to an interesting trail of thoughts. This one was “should you use nested functions to encapsulate logic only used once?”
First, a small refresher: nested functions are functions defined within the scope of another function. They have many uses in Python.
But for this question, the asker was interested in the case where some logic which is used only within foo can be encapsulated within its scope.
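To make the case concrete, here is a minimal sketch of what such a nested function might look like. The names foo, bar, a, b and c follow the discussion in this post, but the bodies are made up purely for illustration.

```python
def foo(a, b, c):
    # bar is only ever needed inside foo, so it is nested here.
    def bar():
        # bar can see a, b and c from the enclosing scope,
        # although (in this made-up body) it only uses b and c.
        return b * c

    return a + bar()


print(foo(1, 2, 3))
```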
When coding, we want to hide immaterial detail. For example, we mark functions as private as much as we can (well, in Python we can’t really, so we use the leading-underscore convention, _like_this). So that’s a good reason to use nested functions: they help the reader understand that the logic of bar will not be used anywhere else.
However, beyond considering the benefits of something, we should also consider the costs. So what are the costs here?
- An extra level of indentation, making the code harder to read.
- Adding a bunch of names to juggle when reading the scope of bar.
- Making the dependencies of bar hard to reason about.
The first point needs no further explanation, I believe. Anyone who has programmed in Python has tasted the double-edged sword of minimizing noise by eliminating brackets (hint: brackets are not noise…).
As for the multitude of names: without training, humans can juggle 1–2 balls in the air. With some consistent training they can do 3. Some talented ones can do 5, but it’s very rare for someone to go that far.
We should program with that limitation in mind. As a rule of thumb, there shouldn’t be more than 3 names I’m carrying around in my mind when looking at a scope. In the nested-function case, when reading the scope of bar I now have to carry a, b and c in mind, needlessly.
As for the last point, in the nested version I have to work out for myself that bar doesn’t depend on a, instead of that being obvious (well, it might be obvious in this example, but consider a less trivial implementation of bar).
So really this version has much more information about it:
The feeling should be one of mental relief: I can let go of some worries when reading parts of this code, whereas previously my mind was clenched, juggling, worrying I might not notice something.
The previous version constitutes a loss of information. This information can still be recovered by inference, but energy has to be invested. Note that one direction is easy: it’s easy to go from this version to the previous one, but considerably harder the other way around, because some reasoning is involved. This is no surprise; it’s much easier to mess a room up than to tidy it.
This feeling might be familiar from refactoring: you discover things about your code, and this is often a painful process, because so much is hidden and has to be worked out.
If the word entropy comes to mind, you’ve grasped what I’m getting at. Writing good code means minimizing entropy. You do so by inferring things about your code and structuring it to be simpler using these inferences. This is best measured by the size of your codebase (“asymptotic” size, not how short your function names are or how much whitespace you allow for…).
Here’s an easy example:
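A minimal sketch of the kind of pair I have in mind, with hypothetical function names: the first version spells the loop out by hand, the second states directly that it is a filter.

```python
def positives_loop(numbers):
    # Version 1: the intent (keep only some items) is implicit in the loop.
    result = []
    for n in numbers:
        if n > 0:
            result.append(n)
    return result


def positives_filter(numbers):
    # Version 2: the same behaviour, written as what it is, a filter.
    return [n for n in numbers if n > 0]


print(positives_loop([-1, 2, 0, 3]))
print(positives_filter([-1, 2, 0, 3]))
```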
We have to “find out” something about the first version to be able to write the second: in this case, that we are filtering something. This insight is what enables the simplification.
Here’s a less obvious one —
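One possible reading of this, sketched with assumed code: only the name foo3 and the default of 7 for c come from the discussion; foo1, foo2, the function bodies and the _former/_latter suffixes (there only so both versions fit in one runnable block) are hypothetical.

```python
# Former version: the default is buried in the innermost function.
def foo3_former(a, c=7):
    # Reading this alone: is c usually 7, or almost never?
    return a * c


def foo2_former(a):
    return foo3_former(a)  # silently relies on c=7


def foo1_former(a, c):
    return foo3_former(a, c)  # overrides the default


# Latter version: c is always passed; the one caller that picks 7 says so.
def foo3_latter(a, c):
    return a * c


def foo2_latter(a):
    return foo3_latter(a, c=7)


def foo1_latter(a, c):
    return foo3_latter(a, c)


print(foo2_former(2), foo2_latter(2))
```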
Going from the latter to the former involves a loss of information: we no longer know when the default is used. Reading foo3, we don’t know whether the default should be considered at all (c is “sometimes” 7).
- Patterns of programming should be judged by their effect on entropy (rather than how familiar they are to use).
- Good patterns may feel limiting at first look — this is natural, you can’t make a mess as easily.
- If left unchecked, your codebase will become a “black hole”: new code can be added, but no information can escape.
- Tracking entropy is not trivial, but is crucial. It is related to duplication, entanglement, readability and code size (“asymptotic”, not how short your function names are…).
“Only entropy comes easy” — Anton Chekhov