This is an article I initially wrote for myself to prepare a talk I gave at the Polymorphic Meetup in Paris, but after giving the talk I realized it may be a good idea to publish the article itself.
As software engineers we love to write software because it’s a process of creation.
From a simple, abstract idea we make our software grow. We add features, fix issues, make the architecture more beautiful and robust. And when we finally consider it ready, we release our code into the world (as a product, a library or whatever).
As Frederick Brooks says in his “The Mythical Man-Month: Essays on Software Engineering”:
“The programmer, like the poet, works only slightly removed from pure thought-stuff. He builds his castles in the air, from air, creating by exertion of the imagination. Few media of creation are so flexible, so easy to polish and rework, so readily capable of realizing grand conceptual structures.”
When we follow the development from the start, it’s easier to be confident with the project. The codebase we need to modify is something we saw growing in front of us, so it feels like solid ground every time we need to make a new change.
A small team working on a new project looks, most of the time, like a happy island. Everyone’s happy, confident and motivated to grow.
The issues start when the team needs to grow in order to address the increasing complexity of the product.
An interesting observation coined by Brooks is that “adding human resources to a late software project makes it later”. (This is known as Brooks’s law.)
What? So hiring new developers will make my team slower? Well, yes. But just in the beginning.
This happens fundamentally because
“It takes some time for the people added to a project to become productive … software projects are complex engineering endeavors, and new workers on the project must first become educated about the work that has preceded them”
To be honest, jumping into an existing project is what happens most of the time. Nowadays software fills almost every aspect of our world, so joining a team already working on a multi-year project is way more common than starting something from zero.
The code we end up reading and writing on a daily basis is code which was written, or has its roots, in a time before our arrival.
Because of this, the software industry evolved in a way that makes it easier to join an existing team.
If you’re lucky enough to start working for an organization with a proper engineering structure, you can expect to find:
- Good quality codebase
- Good documentation
- Good tests and CI environment setup
- Good release process
And most importantly, good mentorship from the senior members of the team.
Some of the points listed above may be missing in your team, but if you can receive high-quality support from your new teammates, the journey to becoming an owner of the project is going to be much easier.
Unfortunately, in some situations you may find yourself joining a new project or team without anybody there to guide you through the onboarding.
No matter how skilled or senior you are, the first contact with an unknown codebase is hard.
Understanding the architecture and following the execution flow can be overwhelming tasks, and the first changes you end up making to the codebase may have unexpected side-effects on the project.
A common first reaction when facing something we don’t fully understand is:
“Screw it! I’ll re-do it in my own way!”
This has been proven to not be the best idea most of the time.
Many articles have already been written on the topic of rewriting software from scratch. An interesting one was written almost 20 years ago (🧓🏼) by Joel Spolsky (the co-founder and current CEO of Stack Overflow), but it contains some still-valid ideas which can help us get a clearer view of the topic.
Not understanding the code is not the only reason why we like to refactor so much. Joel explains it pretty well in the article:
We’re programmers. Programmers are, in their hearts, architects, and the first thing they want to do when they get to a site is to bulldoze the place flat and build something grand. We’re not excited by incremental renovation: tinkering, improving, planting flower beds.
And as a follow-up:
There’s a subtle reason that programmers always want to throw away the code and start over. The reason is that they think the old code is a mess. And here is the interesting observation: they are probably wrong.
Before moving ahead I’d like to share a personal experience that happened to me a couple of years ago, when I started to work as a Software Engineer at Delivery Hero.
In January 2017 I joined foodpanda (a food-tech startup in Berlin) as a Junior Software Engineer. Shortly after, the company was bought by Delivery Hero, and the management decided to merge our engineering department with that of foodora (one of the brands of the DH group).
The story of the codebases of these 2 startups was already complex before my arrival. Both foodora and foodpanda were born under the same incubator (Rocket Internet). Since foodpanda was founded first (2012), when foodora started in 2015 they “bought” foodpanda’s codebase in order to move faster.
As a result, when both companies were bought by DH, the idea was to reunify the projects and build a single application including the best features of the two.
As for me, I was assigned to the team in charge of the geolocation logic of the apps (web and mobile).
On that part, the core difference between the two platforms was:
- Foodpanda built its own geolocation software
- Foodora was using Google Maps
Since every tech company dreams of building all the software it needs internally, the first idea was to try to adapt foodpanda’s solution to work on foodora’s application too.
The problem was: foodpanda’s geolocation app was a piece of very complex software, built by a team of skilled and experienced engineers. As often happens when companies are acquired, after the merge these engineers either left or were moved to management positions inside the DH organization.
As a result, the newly created location team was facing the mission of owning a complex project without the help of the senior devs who created it. The first weeks of work produced no positive outcome: integrating the internal solution without proper documentation or any mentorship looked like a no-go.
At the same time, replacing foodpanda’s code with our Google Maps integration was not an easy choice either, because foodpanda’s solution was the better product, built to perfectly suit the needs of geolocation in food delivery.
The decision we finally took was to go with the Google Maps integration. The company was preparing for an IPO, and delivering a working common app for the two recently acquired companies was a key milestone for DH.
We made the expected deadline, but we had to give up some of the market leadership that the internal (but complex) solution was giving the company. This is because (citing Spolsky’s article again):
When you throw away code and start from scratch, you are throwing away all that knowledge. All those collected bug fixes. Years of programming work. You are throwing away your market leadership. You are giving a gift of two or three years to your competitors, and believe me, that is a long time in software years.
If you decide to do that, it’s at your own risk. I am not saying it’s completely wrong to make this decision. But you should be sure that the risk is compensated by the potential benefits.
Rewriting or throwing away the software for an entire project is, in any case, often a company-wide decision (here’s a good article on some interesting stories about that).
As software engineers we are mostly in the position of choosing whether or not to rewrite a specific part of the codebase (the project we are working on).
That’s an idea that easily comes to mind when we’re facing a codebase we don’t know and can hardly understand.
A little more than one year ago, I joined dailymotion to grow the player team in the Paris headquarters (the player team was historically located only in Sophia-Antipolis).
The part of the project we were asked to take over was the in-video advertising logic. That’s a complex and quite critical piece of software for us: dailymotion makes revenue exclusively through advertising, and every single ad is displayed in the video player.
That means: you break in-video advertising, you get no money.
The first structure of the Paris team was: 2 developers and 1 squad lead. We were all hired at the same time, so we all shared practically nonexistent knowledge of the project. In addition to that, the developer who wrote most of the code related to ads left shortly after.
The first big change we wanted to make was to extract the in-video advertising logic to a separate library. This looked like a reasonable idea as long as the former dev was still at the company. He could walk us through the many modules of the codebase and clear things up whenever we encountered something we couldn’t understand.
But as soon as he left, we found ourselves in a difficult situation. How could we move around critical code that, for the most part, we did not understand?
Not surprisingly, we quickly considered the option of rewriting the ad stack as a separate library, while keeping support for the old codebase in terms of bug fixing.
This idea was thrown away after a short time, mainly for 2 reasons:
- Bug fixing was taking a lot of time
- The new library development was going very slowly (and the planned release date kept being pushed further and further back)
We talked to our management to raise these problems, and they proposed speeding up the recruiting process in order to get more developers working with us as soon as possible.
Unfortunately, as I quoted at the beginning of this article, “adding human resources to a late software project makes it later”.
The team had to admit that decoupling the code into a library was not a project we were capable of tackling at that point in time. And there were no real shortcuts we could take without accepting high risk (less bug fixing, feature cutting).
The only thing we could do was to put the big project aside for a while and focus on learning the project little by little. This required months of studying, and sprints spent simply fixing a couple of tiny, tiny features.
That may sound inefficient, but there’s no short path to real code ownership.
Another good quote from Spolsky’s article is:
The reason that they [the developers] think the old code is a mess is because of a cardinal, fundamental law of programming: It’s harder to read code than to write it.
And reading code takes time.
Our experience finally came in handy when the team started to grow. Being aware of how complex a project is can be the first step towards being a good mentor for new joiners.
We tried to put as little pressure as possible on the people who joined the team, asking them to read and understand the codebase deeply before getting involved in big tasks.
We try to write more documentation now, both in the code and in our wikis, and to push for many pair-programming sessions between seniors and juniors.
So how can we “own” a project?
- Start small: don’t try to understand everything at first, focus on a specific part of the codebase
- Understand the product: a codebase is not just code. It’s the programmer’s representation of a product. Start by understanding what your software is doing (or what it is supposed to do 😉) before jumping into the code
- Read the docs: if some documentation exists, take the time to read it. Knowing a small assumption in the code architecture can often save you hours of headache. Docs usually tell you why some “hackish” solution was chosen in the past.
- Read the code: there’s no shortcut for that. If your goal is to maintain and enrich an existing codebase, you’d better be prepared to read more code than you write.
- Understand the code: this point will probably force you out of your personal “developer comfort zone”. Understanding constructs and patterns that you’re not used to or did not know can be hard.
- Understand the flow: try to follow how runtime execution works, how the data flows through your codebase and how components communicate with each other. Drawing diagrams usually helps with this point
- Write the docs: don’t be selfish! Once you’ve made the effort to understand a part of the project, think about the future engineers who will be in the same situation as you! Take the chance to improve the existing docs, or write new ones if none exist.
It may look like I’m suggesting giving up on refactoring code, and that the only way to own a project is to accept and fully understand what the people before us decided to develop.
Well, that’s not exactly true. Code refactoring is good and it’s a natural step in project ownership. If the code is weak or hard to understand, refactoring is the thing to do!
- Make the code yours: reorganize the project architecture, improve the used patterns, make the code more robust.
What I want to show with this article is that starting with the step “make the code yours” is usually dangerous.
Not only in terms of more bugs or slower feature development: you may also be losing a chance to learn a lot. Every time we are introduced to a new project, we can absorb the experience of those who worked on it before us.
Erasing everything without trying to deeply understand what was done should be done only if you’re an amazing Software Engineer. And are you sure you are one? (yet 😉)