Linear code is more readable
Code linearity is more desirable than keeping abstraction layers separate.
The code on the left is more readable. It is easy to follow and see what is happening at each step.
That being said, the code on the right is easier to maintain. If the requirement comes down the pipe that we now need to support a new pizza topping, it is easy to jump to the add-toppings method using the IDE instead of scanning the entire monolithic procedural function. Or if we now need to add support for take-and-bake, we can just modify the small entry method.
This also assumes that we do not need to reuse any of these methods. If you want to add toppings to a sandwich or a salad, you had better write another huge method like the one on the left, or add a ton of nested if/else or switch statements. If you use the style on the right, you can reuse the add-toppings method without worrying about whether you should preheat the oven for a salad.
The author chose a very simplistic requirement for an example, and it is all well and good until you let it fester in a code base for ten years, with multiple interns maintaining it, and you end up with a huge spaghetti-code monster to deal with.
That being said, the code on the right is easier to maintain
That depends on whether you guessed correctly how it is going to change. All too often you don't, and some weird requirement comes in that your abstraction does not account for, which makes the whole thing really awkward. That tends to lead to spaghetti abstractions that are just as bad to maintain as spaghetti code.
For the context of the code he has given, the code on the left is easier to maintain. At least at the start. Once it becomes more complex, and added requirements start to make it less maintainable, that is the point at which abstractions should be added. Linear code is far easier to find and add abstractions to than already highly abstracted code.
The problem with most of these examples and counterexamples is that they only show simple code and assume that you always want to apply the pattern of abstracting things, or never do. In both cases the code becomes a mess over time, as too many abstractions are just as bad as not enough abstractions. And where those lines fall depends on the code you have in front of you. There are no good rules out there at the moment for all cases. Just some that sometimes work, and others that work in different situations. Better to learn all of them and when they are most useful to apply. And don't be afraid to rip out a bad abstraction (that may have once made sense).
Personally I would do something in the middle of those two solutions. Why is the oven preheat not a method on the oven object? That makes the overall method simpler and still linear. There is no need to abstract every part of the function into methods, but some parts do make good candidates for that.
The problem with most of these examples and counterexamples is that they only show simple code and assume that you always want to apply the pattern of abstracting things, or never do.
This is the real problem. Without context about what the project is for, we can only speculate on what the "best practice" is. If my problem is that I have a directory with 2000 videos in it and I need to process all the ones with an English language track, I am going to write a one-off bash script, not a huge C# project filled with OO concepts.
But if the method is one of 10,000 needed in a huge project, then sticking with the coding guidelines of the whole project is more important for maintainability. A dev coming in 36 months later who is familiar with the code base would have fewer problems going through an abstracted setup, simply because they have experience with the project and can make assumptions from that.
The right is also easier to write tests for, which is crucially important to me.
Do not solve maintenance problems that you don’t face.
On the other hand, not anticipating and preparing for eventualities can be just as bad.
Nothing quite so permanent as a temporary solution, especially those difficult to maintain.
The left side (linear) looks like the code I write while I'm trying to figure out whether I understand the problem or I'm not quite sure what all I need to do prove that I can (or cannot!) solve the problem.
The code on the right, with all the "abstractions" looks like the code I end up with after I've got everything sorted out and want to make things easier to read, find, and maintain.
I actually find the code on the right easier to read. I treat it like a well written technical manual. For me, understanding starts with time spent on the table of contents just to get a sense of the domain. That is followed by reading the opening and closing sections of each chapter, and finally digging into the details. I stop at any level that meets my needs, including just picking and choosing based on the table of contents if there is no clear need to understand the entire domain in order to accomplish my goal.
I've preferred code written in that "abstracted" style since discovering the joy of GOSUB and JSR on my VIC-20.
Exactly, the code on the right lets us read only what matters when we're trying to solve a problem or understand it. The left one makes us read the whole thing and treats code like prose instead of reference material.
And this is actually important when doing your job. I was reading code just yesterday written like the "left side" and it slowed me down because I was forced to understand everything that was happening in a big paragraph instead of just glossing over a function with an understandable name. These "inline functions" will often introduce temporary variables and stuff that forces the reader to understand the scope of things that really don't matter at the current level of abstraction.
To me, the one on the right is FAR more readable. At least if your intention is to understand what it does.
For the left-hand one, you have to read all of it to understand its flow. For the right-hand one, just the first function is enough. You now know what it's doing, and how (roughly) it's doing it. Each subsequent function then explains a step, in detail. This allows you to focus, say, on cooking, without any understanding of the preparation phase, or the boxing phase. This vastly reduces the cognitive load, and lets me focus in on the small details.
This same set of properties is also what makes the right hand one easier to maintain. It lets you treat parts as black boxes, you don't need to know, or care, what goes on inside. You can work on 1 with minimal references to other parts. E.g. the oven could be a simple warm up, cook, cool down cycle. However it could also be a complex, machine learning driven system, governing 100s of concurrent ovens and 10,000s of pizza for optimal cooking speed, energy use and maintenance requirements. Outside of the function, I don't need to care. It's just "Give me an oven, wait, here's your cooked pizza".
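The "table of contents" reading described above can be sketched in Go. This is not the blog's actual code, just a minimal reconstruction of the style under discussion, with hypothetical type and function names:

```go
package main

import "fmt"

// Hypothetical types standing in for the blog's example.
type Order struct{ Toppings []string }

type Pizza struct {
	Toppings []string
	Baked    bool
	Boxed    bool
}

// createPizza reads like a table of contents: each step is a named
// black box, so a reader can stop at whatever depth meets their needs.
func createPizza(order Order) Pizza {
	pizza := prepare(order)
	pizza = bake(pizza)
	pizza = box(pizza)
	return pizza
}

func prepare(order Order) Pizza { return Pizza{Toppings: order.Toppings} }
func bake(p Pizza) Pizza        { p.Baked = true; return p }
func box(p Pizza) Pizza         { p.Boxed = true; return p }

func main() {
	p := createPizza(Order{Toppings: []string{"cheese"}})
	fmt.Println(p.Baked, p.Boxed) // true true
}
```

The point is that a reader who only cares about boxing never has to open bake at all; each step could later grow arbitrarily complex behind its name.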
I get the point the author is coming from. When I was teaching first-year engineering students programming, the one on the left is how everyone would write; it's simply how humans intuitively think about a process.
However, the one on the right feels more robust to me. For nontrivial processes with multiple branches, it can get ugly real quick if you haven't isolated functionality into smaller functions. The issue is never when you are first writing the function, but when you're debugging or coming back to make changes.
What if you're told the new Italian chef wants to have 15 different toppings, not just 2? He also has 3 new steps that must be done to prepare the dough before you can bake the pizza, and the heat of the oven will now depend on the dough used. My first instinct, if my code was the one on the left, would be to refactor it to make room for the new functionality. With the one on the right, the framework is already set and you can easily add new functions for preparing the dough and make a few changes to addToppings() and bake().
If I feel too lazy to write "proper" code and just make one big function for a process, I often end up regretting it and refactoring it into smaller, more manageable functions once I get back to the project the next day. It's simply easier to wrap your head around heatOven(), bakePizza(), box() than to read the entire function and look for comments marking each important step. The pizza got burned? Better take a look at `bakePizza()` then.
If the chef wants 15 toppings, then you start abstracting it. There's no point in overengineering.
Might as well start with a solid foundation from the start though. The extra work is minimal so there isn't much of a time cost to it. I wouldn't call it overengineering, it's just a different way to write code, and the way many naturally default to without really thinking about it.
The right example is poorly executed. The left example is fine, but has one crucial deficiency: it's not very modular, which makes it difficult to test and scale. Big problems need to be broken down and managed in discrete steps, and this is also true of computer code.
The left example is like running a pizza shop where you explain all the steps to everyone and then let everyone loose at the same time to make a pizza. The right example is like creating stations and delegating specific responsibilities to one person at a time.
The former creates redundancy and is manageable at small scale. But as you grow, you find that the added redundancy is of no additional value, while you end up with chaos, as people argue and fight over the process.
Can you imagine five developers working on the monolithic pizza code all at the same time? Total chaos. Better to have one developer assigned to baking, another assigned to prep, etc.
On the contrary, I think that the left piece of code does not build in constraints prematurely, and actually lets you modularize it away when needed.
Sure, if the logic grows, if it needs to scale, if the team increases in size... then it makes sense to modularize it. But building something from the very beginning to achieve that is going to impose constraints that make it harder to reason about and harder to refactor; you'll have to break down the previous structures and boundaries built by the function-heavy example, which will probably introduce needless indirection.
This is exactly the point made in this 2010 article that's probably one of the things I link to most often in online discussions.
https://whatheco.de/2010/12/07/function-hell/
Also, the real problem with the code on the right in the OP is all the state mutations made to the arguments passed into the functions.
Not too familiar with golang, but this really could use some sort of builder pattern or something, if you're going to mutate the object. Or a more functional style with just a pure function mapping an input "order" to output "toppings", etc.
There are plenty of ways to make the bad design on the right better without stuffing everything into a single function, so the whole premise is a bit of a false dichotomy.
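One of the alternatives mentioned above, the pure-function style, can be sketched in Go. This is a hypothetical illustration, not the OP's code: each step takes a Pizza value and returns a new one instead of mutating a pointer argument:

```go
package main

import "fmt"

// Hypothetical type for illustration.
type Pizza struct {
	Toppings []string
	Baked    bool
}

// addToppings returns a new Pizza rather than mutating its argument,
// so callers can't be surprised by hidden state changes. The slice is
// copied before appending to avoid sharing backing storage.
func addToppings(p Pizza, toppings ...string) Pizza {
	p.Toppings = append(append([]string{}, p.Toppings...), toppings...)
	return p
}

func bake(p Pizza) Pizza {
	p.Baked = true
	return p
}

func main() {
	base := Pizza{}
	done := bake(addToppings(base, "cheese", "basil"))
	// base is untouched; done carries the accumulated state.
	fmt.Println(len(base.Toppings), len(done.Toppings), done.Baked) // 0 2 true
}
```

This costs some copying, but every intermediate state stays inspectable, which is exactly the property argument mutation gives up.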
Knowing what and when to abstract can be hard to define precisely. Over-abstraction has a cost. So does under-abstraction. I have seen, written, and refactored terrible examples of both. Anecdotally, flattening an over-abstracted hierarchy feels like less work, and usually has better test coverage to validate correctness after refactoring, than abstracting under-abstracted code (spaghetti code, linear code, brand it how you will). Be aware of both extremes and try to find the balance.
People have already mentioned testing and abstraction, but what about other developers and security?
Spaghetti code all you like in solo projects. But if someone else is coming along to debug a problem in their toppings, why would you make them remember anything about baking or the box when it's completely irrelevant?
And why should the Box object be able to access anything about the Oven's functionality or properties? Enjoy your oven fire and spam orders when someone works out they can trigger the bake function or access an Order's payment details from a security hole in the Box object implementation.
It's not just about readability as a narrative, even if that feels intuitive. It's also about memory management, collaboration and security.
The author’s primary gripe, IMHO, has legs: the question about the oven’s relationship to baking is buried as part of bake(), and that is weird. But the solution here is not the left-hand code, but rather to port a good, old-fashioned OOP pattern: dependency injection. Pass a reference to an Oven in createPizza() and the problem is solved.
Doing so also addresses the other concern about whether an Oven should be a singleton (yes, that’s good for a reality-oriented contrived code sample) or manufactured new for each pizza (yes, that’s good for a cloud runtime where each shard/node/core will need to factory its own Oven). The logic for cloud-oven (maybe like ghost kitchens?) or singleton-oven is settled outside of the narrative frame of createPizza(). Again, the joy of dependency injection.
To their other point, shouldn’t the internals of preheating be enclosed in the oven’s logic…why yes that’s probably the case as well. And so, for a second time, this code seems to recommend OOP. In Sandi Metz style OOP in Ruby (or pretty much any other OOP language) this would be beautiful and rational. Heck, if the question of to preheat or no is sufficiently complex, then that logic can itself be made a class.
As I write, I thought: “How is golang so bad at abstraction?” I’m not sure that that is the case, but as a writer of engineering education, I think the examples chosen by the Google Testing Blog don’t serve well. Real-world examples work really well with OOP languages, fast execution and “systems thinking” examples work great with golang or C. Perhaps the real problem here is that the contrived example caters to showing off the strengths of OOP, but uses a procedural/imperative-style-loving language. Perhaps the Testing Blog folks assumed that everyone was on-board with the “small factored methods approach is best” as an article of faith and could accept the modeled domain as a hand wave to whatever idea it was they were presenting.
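The dependency-injection suggestion above can be sketched in Go, even without classes. All names here are hypothetical stand-ins for the blog's example; the point is only that createPizza receives an Oven instead of constructing one, so singleton-vs-per-request is the caller's decision:

```go
package main

import "fmt"

// Hypothetical type for illustration.
type Pizza struct{ Baked bool }

// Oven is the injected dependency; any implementation satisfies it,
// from a trivial test double to a fleet-scheduling cloud oven.
type Oven interface {
	Bake(p Pizza) Pizza
}

type HomeOven struct{}

func (HomeOven) Bake(p Pizza) Pizza {
	p.Baked = true
	return p
}

// createPizza no longer knows or cares where the oven came from.
func createPizza(oven Oven) Pizza {
	return oven.Bake(Pizza{})
}

func main() {
	fmt.Println(createPizza(HomeOven{}).Baked) // true
}
```

Whether the caller shares one oven across all orders or news one up per request, createPizza's narrative frame stays untouched.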
I think the crux of good software design, is having a good feeling/intuition when to abstract and compose, and when to "inline" code as it's used only once anyway (in which case "linear code" is IMHO often the better choice, since you're staying close to the logic all the time without having to jump between files etc.).
I agree the examples of the google blog are bad, they don't really show how to compose code nicely and in a logical way (whether that would be more data-oriented, imperative or OOP).
I put all my code in a single file and use goto statements just to make people mad
Damn, I have a lot to learn as a junior dev
Personally, I prefer to do the opposite and break things up until my import is longer than my code
Left side doesn't look too bad until you realize that most people (including your future self) reading this file later will just be interested in the first 5 lines of the right side. They won't care about all the details of how the pizza was made, and the left side has too big of a scope to quickly glance it and go to the part that matters.
Not to mention reuse of functions and maintainability. Right side every time.
I'd argue you're right until you need to track down a bug in the code. Then, to the author's point, you have to jump back and forth in the code to figure out all the interdependencies between the methods, and whether a method got overridden somewhere. What else calls this method that I might break by fixing the bug? (Keep in mind this example fits on one screen, which is not usually the case.)
All these debugging problems are still better to solve than if all the code was in the same scope, like on the left side. It's not worth exchanging possible overriding or data interdependency for a larger scope.
Both styles have advantages and disadvantages. Fully procedural code actually breaks down in readability after a certain length; some people suggest 100 or maybe 200 lines, depending on how much is going on in the function.
Blanket maxims tend to have large spaces where they don't apply.
Additionally, the place where the code on the right is more likely to cause bugs and maintainability issues is the mutation of the pizza argument in the functions. Argument mutation is important for execution time and memory performance, but it is also a strong source of bugs, and should be considered carefully in each situation. We don't know what the requirements for this code are, but in general we should recommend against universal use of argument mutation (and of mutability in general).
I've worked in a company that used linear code most of the time. And at first it felt really easy to read and work with. If you wanted to know what happened, just jump to the entry point, then read over the next 200 lines of code, done. No events, no jumping around between 10 different interfaces, it worked at first.
But over time it became a total mess. A new feature gets added, now those 200 lines get another if/else at several spots, turns into 250 lines. Then a new option is added that needs to be used for several spots, 300 lines. 400 lines. 500 lines.. things just escalate.
You can't test that function and bugs sneak in far too easily. If you want to split it up later into new functions it's going to be a major hassle. There also was no dependency injection or using interfaces, other classes were often directly called or instantiated on the spot. Code reuse was spotty and the reused functions often got 5+ parameters for different behavior.
It was horror after a while.
The company I work for now uses interfaces, dependency injection, unit tests, but all the way down a function might still have 50 lines tops or so. It's slightly tougher to find out where things happen, but then much easier to work with. You need a certain balance either way.
I always write my code linearly, like the left example, with comments like further in the article. Actually, what I do is write all the comments first and then add the code. If I push my code like that, everyone immediately understands it, finds bugs and potential issues with it, and then tells me to refactor it into whatever flavor of best practice they like. If I structure it like the right example, reviewers still complain about the structure I chose but never identify any bugs or other real issues.
All my career everyone has said that elegance and cleverness are bad, but the people who get promoted are the ones who insist on elegant and clever code. I guess it's because they're confident and vocal, and that's what humans are programmed to pick as leaders.
One downside of the code on the right is that it's not obvious where the different functions might be called from. In the example on the left, we know that we're not, say, adding toppings to a pizza that we've already baked. If we notice a bug in the topping-adding function on the right, we might get tempted to reason about adding toppings as a general process that needs to handle all kinds of situations when in practice it doesn't.
When you have single use functions like this it's good to limit the scope of the function using whatever language features are available so that you can more easily reason about where it's being called from
One way to make it obvious which function can be called in which state is to use different types, like UnbakedPizza and CookedPizza, and the bake function takes the former and returns the latter.
I sit somewhere tangential on this - I think Bret Victor's thoughts are valid here, or my interpretation of them - that we need to start revisiting our tooling. Our IDEs should be doing a lot more heavy lifting to suit our needs and reduce the amount of cognitive load that's better suited for the computer anyways. I get it's not as valid here as other use cases, but there's some room for improvements.
Having it in separate functions is more testable and maintainable and more readable when we're thinking about control flow. Sometimes we want to look at a function and understand the nuts and bolts and sometimes we just want to know the overall flow. Why can't we swap between views and inline the functions in our IDE when we want to see the full flow? In fact, why can't we see the function inline but with the parameter variables replaced by passed values to get a feel for how the function will flow and compute what can be easily computed (assuming no global state)?
I could be completely off base, but more and more recently - especially after years of teaching introductory programming - I'm leaning more toward the idea that our IDEs should be doubling down on taking advantage of language features, live computation, and co-operating with our coding style... and not just OOP. I'd love to hear some places that I might be overlooking. Maybe this is all a moot point, but I think code design and tooling should go hand in hand.
What worries me is that neither version handles any errors whatsoever.
What happens if the oven never gets hot, because the gas interface has been brought down for maintenance? Now we've allocated a raw pizza that will never be baked. Since we never time out or check errors here, eventually the customer will time out waiting for a pizza and switch to our competitor.
Are we really allocating a new oven for each pizza? Probably not; oven may be a singleton. In one case, it's possible for the oven to fill up with pizzas and oven.Insert to fail; in the other case, it's possible for the kitchen to fill up with ovens, and oven.New to fail. Lacking error checking, we're eventually going to get one or another kind of oven overflow.
What happens if order.kind is "fuck you"? We don't put any toppings on the pizza; but (in code not shown) the value of order.kind probably still gets printed on the receipt. So some poor schmuck gets delivered a none pizza with fuck you. Like so many other Internet services, our pizza service can be exploited for harassment.
The left is clearly more readable, once you read the comments. I think the example is bad, though; in most cases the right would be clearer.
The code on the right is better (though perhaps taken to an extreme), and what it comes down to is that the code on the right makes you think in terms of the layers of the function when you make a change.
Linear functions provide very little friction to changes that (when you see the interaction through all the layers) are actually quite bad. The code on the left will -- without extreme discipline -- become spaghetti as the years pass. Even with extreme discipline as the lines of code in the function grow, and other comments are added, the actual flow of the function will be harder to see and harder to grok because it can't all be put up on the screen at the same time.
It's ultimately the same idea as minimizing shared state and side effects. You want a bunch of small pieces that can be independently verified as doing their job, rather than one large thing that you can't understand at a glance.
You can always extract out the linear code when it actually becomes useful. It's much easier to refactor when needed.
I disagree, by the time you find you "need to" it's often already tangled.
I generally prefer a mix of the two: a chain of linear logic that pulls clean chunks out into methods when they get too complex or need repetition/recursion, etc. I would rarely have a method that is just a list of function calls.
As with everything, there isn't a one size fits all ideal solution, it depends on the exact code.
Learner here. I don't like nesting single use functions. The moment I follow a function and it's just another abstraction for more functions I start feeling dread. Also I can't help but feel bad for cluttering the namespace by breaking out single use functions in general, even if they're private. Seeing the sentiment here, guess I gotta get used to it.
Think of it more like bigger building blocks rather than single use functions. If there is an issue with the pizza arriving burnt black at the customer you don't want to read through the logic for making the dough and adding toppings if the most likely cause is the oven.
Sure, you could add comment blocks to mark the sections. But you can't easily jump to that exact point. With function names you can easily skim over the unimportant calls, or go through a list of functions/methods in the file and jump there directly. With comments that's not a standard feature in IDEs.
Also, that function does not scale well if you have more than 2 topping options. Maybe some are not compatible, and logic needs to be added so that you don't put pineapple and banana on the same pizza, for example.
But I understand your argument about following through multiple layers of abstraction. That's something that irritates me a bit as well: following a function that does nothing but pass the same parameters through to another function and return the result. No guard clauses, no changes to the data, just a 1:1 pass-through. And this then goes 2-3 levels deep before reaching real code again. Bonus points if they are all in different files too.
I don’t like nesting single use functions.
At a certain point this is necessary due to overall length. You don't want a single function that is hundreds of lines long - they suck to debug and to test. Single-use functions break that up into logical chunks that make it easier to understand.
The moment I follow a function and it’s just another abstraction for more functions I start feeling dread.
This can actually be ideal in many cases due to the Single-responsibility Principle. Think of the purpose of those functions as coordinating the workflow of the other functions.