Book Summary: Growing Object-Oriented Software, Guided by Tests

Cover image
Image credits: Joshua Lawrence

I've recently been lucky enough to receive an O'Reilly Learning membership from work (we're hiring btw 👀). Part of the perks of this, includes access to thousands of books for free. As someone who has always somehow found an excuse not to get through my list of software development related books to read, I decided that this must've been a sign that this needed to change.

As the title of this post suggests, Growing Object-Oriented Software, Guided by Tests by Steve Freeman & Nat Pryce (also known in our community as "the GOOS book"), became my weapon of choice. I cannot quite remember who suggested this book to me, but it was on my list of books to read. At first glance, with the book being around 11 years old, it obviously may come across as slight outdated, but whilst this may be the case for some of the practical examples within the book written in Java using libraries such as JUnit & JMock, as we know in our industry, languages, frameworks & libraries come & go but most of the time the underlying concepts remain roughly the same.

The book seems to be seen as a natural sequel to Kent Beck's book on TDD, Test-Driven Development by Example. The book has a seal of approval from Kent Beck himself, along with other popular names such as Robert Martin (Uncle Bob) & Michael Feathers. Whilst Beck's original book introduced TDD to the world, it did so at a relatively simple level using a money conversion example. It applied what is known as the Classicist approach to TDD.

In the real world though, software is more complex than the money conversion example. We find ourselves dealing with significantly more complex domains & designing classes with more & more dependencies which calls for techniques such as mocking or stubbing. We even sometimes have to deal with legacy code that is highly coupled & difficult to refactor which requires complex data setups before you even think about figuring out how to test the big ball of mud. The book looks at how to deal with these things.

Overall, I think it's one of those books where you cannot completely soak it all in with one read through. It's one of those books that you'll always go back to, regardless of whatever tech stack you're working in. I already knew about concepts such as Acceptance Test-Driven Development, mocking using libraries such as NSubstitute or Moq & things like walking skeletons. However despite a slight disconnect between myself & the Java examples within the book (close enough to C# right?), I found that reading it did a great job in re-affirming my belief in many concepts that I have been taught by more senior developers to myself. In this post, I'm going to share my most important takeaways from the book.

Fellow developers are also a customer

A lot of the time, when we are triggered to write code, it is in response to a User Story/Functional Requirement from a stakeholder/customer of some sort. Whilst our aim should always be to provide this piece of incremental value to the customer (by all means, ship something that works if you're going lean then refactor later), as developers we can sometimes fall in to the trap of forgetting that we ourselves are actually also going to be consumers of that code. Unless there is a necessity not to optimise code for readability (such as meeting a Non-Functional Requirement of some sort or a deadline) & maintainability, then there is no excuse not to do so. Code is read far more than it is written.

There tends to be a direct correlation between team productivity and code readability. The book says "we value code that is easy to maintain over code that is easy to write".

Tests are not just about helping developers refactor existing code or trying to catch bugs. Tests should also be a living documentation of what the software system does, they should be seen as examples of how our software should be used. Tests should be named to describe a feature of our system, including things such as the expected result. Common patterns followed when naming tests include the Given, When, Then as it is not limited to just BDD.

Since a software system's tests should explain the features (which were once requirements) of that system, we should apply the same care towards ensuring the code within our tests are as readable & descriptive as possible.

The book emphasises the importance of the refactor stage of the Red, Green, Refactor TDD cycle & using the Refactor step to include the refactoring of test readability, not limiting ones self to just implementation code refactoring.

Listen to your tests

As soon as you start writing a test, you are declaring the start of a short lived feedback loop for yourself. This feedback loop, guides you as you start your implementation. This is the Red stage of the TDD Red, Green & Refactor cycle. If something is difficult to test, even if you have still not exited the Red stage, you are still getting feedback that there may be something wrong with your design of the thing under test. This is an example as to why many refer to Test Driven Development, not just as a testing strategy but also a software design strategy. If you are struggling, listen to the pain & fix it.

If something is difficult to test, then it will most likely also be difficult to maintain in the future for both yourself & other developers. Often this pain stems from violating certain principles (we'll cover Tell, Don't Ask later), the object under test trying to do too much at once or the item under test having too many dependencies which are difficult to mock out.

If a test is failing, but has a descriptive name & readable contents, then it will point you to the source of your problem & reduce the cycle time to turn it green again.

A web of objects

The book describes object-oriented design as "focusing more on the communication between objects than on the objects themselves". I think anyone who has built relatively complex object orientated software, would agree with this statement. As software grows in complexity & developers optimise for readability (e.g. SRP), the number of classes & dependencies grows.

Assuming a class, instantiated on an object, depends on another class, you end up with what the book describes as a "web of objects" that are all interconnected & dependent on each other. Often these dependencies are defined using interfaces which the book says, define a communication protocol between two objects. For example, we may have a service class that has a dependency on an IRepository interface which defines a contract. We verify that a correct call was made by the service class, to the mock repository within our Unit test. This correct call being made is the communication between objects. It tells us that our service class acts as we expected to, as long as the dependency is satisfied, regardless of what implementation of IRepository we use (our service class will work with any class that implements the IRepository interface).

By using mocks & verifying communication protocols, it helps us make these communication protocols visible within our tests as well as put them under test.

The Tell Don't Ask principle

There are lots of resources on this principle over the internet, so I don't want to go in to too much detail. But it is brought up within the book.

The way that Freeman & Pryce sum it up in the book is: "Ask the question we really want answered (from an object/instance of a class), instead of asking for the information to help us figure out the answer ourselves".

The way that I've always tried to adhere to this principle is by trying to avoid having Anemic Domain Models (even where DDD isn't necessarily strictly the goal). This way I naturally most of the time end up with favouring writing methods on objects that hold data, to do the work with that data. So it makes it more difficult for me to call a getter from elsewhere & create "train wreck" code.

((EditSaveCustomizer) master.getModelIsAble()

The book describes not following this principle as resulting in "train wreck" code because our code can often result in a chain of getters on an object versus a single method call.


This also makes the testing of the class that makes the calls to master easier as we can mock a single method call rather than the return value of each "train wreck" carriage. A nice example can be found here.

Tips to write even better tests

Someone, somewhere, once invented a one assertion per test rule. The book disagrees & says this is not always the right thing to do. Multiple assertions are fine. Someone, somewhere, also once invented a one method per test rule. The book disagrees & says we should be testing behaviors rather than on a method by method basis. Multiple method calls are fine.

Libraries such as JUnit provide ways to define test setup methods or teardown methods. I tend to gravitate towards XUnit in .NET, but I'm by no means an expert. Learn & use these techniques as they will help you increase readability within your test suite.

Use the triple A (Arrange, Act, Assert) or GWT (Given, When, Then) testing patterns within your tests to help separate concerns & increase readability.

Consider an expressive assertions library such as Fluent Assertions or Shoudly to increase readability.

When complex test suite setups are required due to complex objects, resulting in huge constructors everywhere, consider using the Builder Pattern to build your test objects. It will help reduce code duplication... & increase readability.

(Noticed the pattern/emphasis on writing readable code yet?)

Acceptance Test-Driven Development (ATDD) or Double Loop TDD

Unlike within Kent Beck's original book, the GOOS book uses what is known as the Double Loop TDD method or Acceptance TDD. This is where you start off with an Acceptance Test (most likely written using a BDD framework such as cucumber by a stakeholder) to represent the new feature to be built from the outside in, completely black box.

The developer then uses regular TDD in smaller iterations - white box regular developer/unit tests - until enough has been done for the Acceptance Test to pass, confirming we have for sure satisfied the user requirement. For example, a single Acceptance Test might spin up an entire API & make a HTTP request then assert on the output. Internally for this to work, there may be numerous moving parts within the system. A controller, a service class, a repository, including dealing with non-happy paths. These require numerous unit/developer tests. Spinning up an entire API & making a HTTP request is obviously not what we'd do in a regular unit test as this has an overhead. We look at how the book defines the differences between unit, integration & end to end tests in the next section.

Despite what the GOOS book says about Acceptance Tests however, Ian Cooper talks about whether they actually deliver any added value to developer tests assuming TDD has been done correctly in the first place, including the lack of stakeholder interest in writing & maintaining BDD type tests. So I'll leave it up to you to decide!

External, Internal Quality & Types of Testing

The book lays out two types of quality which we look to achieve with tests when engineering software, namely both external & internal quality.

External quality refers to whether our software meets the needs of those outside of the development team. Does our software satisfy the needs of our customers/users? Both functionally & non-functionally?

Internal quality refers to how the software meets the needs of those inside of the development team. How maintainable are the code base & tests? Is the code easy to understand & change? What does the average cycle time look like? How often does it wake you up at 3am because of random exceptions?

Most of the time when it comes to quality, development teams tend to neglect internal quality. This is sometimes down to poor development practices & naiveness... sometimes it is because you work somewhere that gives you no time to go after tech debt, the list goes on. As developers I think the best we can take from the book is that we should care about it & think about how the work we're doing that day, affects the internal quality of the software.

In the bigger picture, it should get you thinking about how you can start metricising & proving poor internal quality to the business, in order to get time to address it.

I don't think the book itself mentions Martin Fowler's testing pyramid, however it talks about the importance of finding the right balance between different types of tests. It talks about the risks of testing too finely grained versus the problems of testing too broadly.

It describes unit tests as being the most useful for providing the development team with feedback on internal quality. Which makes sense. We use them when we're refactoring. If they're difficult to write, it tends to mean our code is highly coupled & we have internal problems that need addressing.

On the other hand, end to end tests provide the most feedback on external quality. Cypress UI tests for example, literally mimic your user.

Integration tests lie somewhere in the middle. The book says that they are very useful when testing against code that we don't own & that they should be used to gain confidence that our system integrates across boundaries it has dependencies on. For example, Ian Cooper talks about how the development community over the years has gotten a little bit confused over what an integration test is & isn't. Testing two classes together, isn't an integration test if they're both code that you own. Testing a class that makes a call to a third party API (across a network boundary) or that queries a SQL database (across a network boundary), is an integration test, because you want that test to give you confidence that your code integrates with the thing you don't own.

The Walking Skeleton

This refers to a barebones application that can be built, tested & deployed in to test & production environments. This obviously also includes things such as infrastructure & appropriate CI/CD tooling all set up to get there. The book talks about starting every new project, with a "walking skeleton".

The book also talks about why "walking skeleton"'s are difficult to do in some businesses that are used to development starting immediately at the start of projects. For example, if a business is used to seeing a certain type of tangible progress (for example, a UI) within the first two sprint reviews of a project running on someone's machine, then they won't be too pleased when you come to them after two sprints with a blank POC application which you think is cool just because it's in a real environment, with real pipelines, integrated to real things rather than stubs.

The issue here is that some businesses are used to having development begin straight away at the start of a project & see it slow down towards the end of a project when problems such as testing or deployment come up. This gives a false sense of confidence (& satisfaction) to the business at the start of the project & the slowness towards the end is normalised when in reality it is highlighting a more important problem. For example, if it's going to take an entire month to get sign off from a bunch of architects or seniors or directors to set up some infrastructure in a production environment, then logically we want to shift left on this & be aware of this as early as possible... rather than a week before a project go live date! It is not only a false sense of confidence to the business, but also to the development team because problems that can crop up, only by trying to put something in a real environment may have also been useful earlier in the development feedback process... to influence development & technical decisions made earlier on which is obviously greatly beneficial.

So the book emphasizes the importance of prioritising your "walking skeleton". Uncover the unknowns early on & be deploying from day one, ensuring your project is smooth sailing until go-live. Do not waste months of development without having confidence in being able to deploy your work & test end-to-end in a real production-like (minimum) environment & then panic at the end!

More gems

Learn to use mocks within your unit tests & why you might want to assert on expectations/communication patterns. Use them where you need to, for example you test-drive new functionality with a test that revolves around a requirement from the outside-in & you then mock out a network layer call or database call. Do not test-drive your code, triggering a new test by creating a new class or method, then end up trying to test individual classes & methods. Focus on testing your requirements/features from the outside-in & you naturally cover any new class or method that gets written in response. Particularly also do not blindly mock all dependencies of a class where you can avoid doing so. This helps you avoid brittle tests. Ian Cooper's talk that I mentioned earlier is a must watch on this topic.

Test-Driven Development is a feedback loop on the quality of our implementation & design. Red, green, refactor. If something is hard to test, then it'll be hard for the next developer to use.

Aim to write the quickest, dirtiest solution to get to green from red. Don't try to write the perfect solution (from your head) straight away unless completely trivial, as you risk over-engineering. Your passing test then gives you a checkpoint to refactor with confidence.

Understand your testing framework & how to use things like test setups, teardowns & parameterising tests. Not mentioned in the book are things like AutoMock & AutoFixture which I can recommend.

The book talks about ports & adapters/hexagonal architecture/clean architecture as being a good way to isolate business domain code from other dependencies like databases & interfaces. You can then define the requirements of your system & how users interact with it through commands (do something, then have side effects) & queries (communicate with something, return the resultant data). Then test these individually as black box.

Don't mock what you don't own or code that you cannot change. Consider writing wrapper classes that encapsulate these scenarios, such as repositories or proxies. Then mock what you do own out as your code is interacting with them, instead.

Enjoyed this blog post? 😄