Why Does SCRUM Make It So Hard To Build High-Quality Software?

So we recently started to use SCRUM, and I can’t tell if I love it or hate it. During our first sprints, I noticed something I did not expect: My team and I suddenly felt pressured to deliver low-quality software. That was particularly weird because SCRUM was sold to me as a best practice to develop high-quality software.

A Brief Digression on Software Quality

Before I go into the details of why I believe SCRUM makes it so hard to build high-quality software, let me explain how I understand software quality. There are essentially two types of quality: Internal quality and external quality. In general, both are equally important. However, they are relevant to completely different stakeholders, and, more importantly, their relative importance changes as the software matures.

Internal quality assesses the technical foundation of the software. It describes how the software is engineered. It measures things like the technical debt, the robustness and completeness of the test suite, and the understandability, extensibility, and maintainability (see Clean Code).

External quality, on the other hand, assesses how well the software satisfies the user’s needs. It essentially describes whether the software is useful at all. External quality may be measured in a net promoter score, user satisfaction, or process-oriented measures like the degree of automation. It matters to the users, the sponsors, and every external stakeholder.

I think it is pretty straightforward to conclude: Without external quality, nothing else matters. Also, nobody cares about internal quality for a one-time proof of concept. It is also much less important at the beginning of a project when we are still in the prototyping phase. However, as the software grows more complex, our user base gets bigger, and the stakes get higher, internal quality becomes more important.

And this is where it gets interesting.

SCRUM Encourages Low Internal Software Quality

SCRUM as a management process does not tell you how to deliver software with high internal quality. It is all about external quality. Every metric and every ritual is designed to raise the external quality of the software that is being developed. And since it is insanely effective at doing so, teams are under continuous pressure to jeopardize the internal quality.

I wouldn’t say that SCRUM is flawed. However, I would argue that SCRUM lacks certain aspects. It leaves the internal quality up to the team (which makes sense, up to a certain degree). And I believe this can become a problem.

Why Does SCRUM Encourage Low Quality Software?

SCRUM brings concepts and rituals that are very valuable if looked at in isolation. However, put together, they can result in low-quality software and short-term thinking.

SCRUM is not “laissez-faire” about commitments and about the sprint goal.
Most of the SCRUM metrics are outcome- or value-oriented from a user perspective.
SCRUM has planning horizons of one to four weeks and no longer-term view.
Prioritisation is done by “the business” or other non-technical people.

So here’s the point: Everything that SCRUM does focuses on boosting the external quality of your software (i.e. prioritizing and delivering valuable features). SCRUM gives a lot of guidance on how to improve external quality, but little to no guidance on how to improve internal quality.

Teams that do not have an experienced lead engineer who can ensure internal quality are pressured by SCRUM’s metrics and rituals to focus exclusively on delivering as many features as possible. And think about this for a second: Because they are working in an “industry best practice” and stakeholders are typically very happy with the initial progress and transparency after the first few sprints, there is no apparent reason for this team to deliver fewer features and take the time to refactor the application.

Teams that do possess such knowledge have to actively work against SCRUM’s incentive structure. SCRUM makes every little progress in external quality very visible to everyone while being completely opaque about the internal quality. This makes it very hard to hold back pull requests that do not satisfy the standards for internal quality. This gets especially hard if stakeholders or product owners are already satisfied with the external quality of the feature.

Software that works locally does not necessarily have the quality needed for production.

Why does that happen?

Sprint Goal & Commitment

During refinement, we estimate the top stories in our backlog. At this point, this is nothing but a more or less educated guess, typically based on a “reference story”. During the sprint planning, we pick as many stories as we expect we can get done during the sprint. This typically means converting the complexity into my personal estimate of how long I believe the implementation will take.

So why can this become a problem? Because technical debt is everywhere. It may be the module you wrote 6 months ago. It may be something a Junior put into production while the supervising Senior was on vacation. Or it may be something that simply slipped through review. We are all constantly learning and evolving, and if you believe that your work from over a year ago is still top-notch, you probably did not get any better at what you do.

We will inevitably encounter some sort of technical debt during development that we did not plan for. And this will force us to make a decision: Do we fix that debt or do we hide it (and “elegantly” work around it)?

Consider this: If we fix the code, we will most certainly miss our commitment. If we miss the commitment, we have to explain ourselves in the Review and the Retrospective. For most people, the decision is mainly based on emotions and the psychological stress it incurs.

Because we plan with partial information and because SCRUM takes its plans and commitments very seriously, it actively encourages a development team to close their eyes to technical debt.

Metrics and Transparency

In a nutshell, a story is something that is valuable to the business. Stories are grouped into epics and typically prioritized by a steering committee. So how is a SCRUM team eventually measured and perceived? By the number of stories they push through.

Tell me how you measure me, and I will tell you how I will behave
– Eliyahu M. Goldratt

I do believe that internal quality should never serve as an end in itself. I always consider it an investment that will cost me something today but will pay off later. So, in order to positively impact the metrics that matter, it is easy to skip the investment today. “We will deal with that later, once we have finished that shiny new feature for the business.”

Bear with me here: I’m not saying that SCRUM forces anybody to behave that way. But psychologically it is much easier to focus on what is being measured. Focusing on something that nobody outside of the development team understands and that will negatively impact visible metrics is much harder.

The real-life constraints we operate under make this even harder: Most of us aren’t well-seasoned experts in a well-functioning team within a well-functioning company. Most of the time, we work on a problem for the first time. As a result, we don’t know the final architecture yet. We are all constantly learning, and we will make mistakes and take detours.

Prioritisation is Done By Non-technical Folks

So why don’t we simply put refactoring “stories” into the backlog and prioritize them? Because prioritization is rightfully done by non-technical people and should be done based on “business value”. Explaining the business value of a refactoring is futile because, measured today, it is zero. Refactoring enables us to hopefully deliver more business value in the future by certainly delivering less business value today.

Within the regular SCRUM process, there is no way to get refactoring prioritized separately.

SCRUM Has A Short-sighted Planning Horizon

The combination of these principles is bad enough already. But I believe that the concept of one- to four-week “Sprints” as the only planning horizon that SCRUM knows of is the actual culprit. SCRUM introduces no high-level or far-sighted planning cycle that embeds a series of sprints. This gives us no guidance on when to stop sprinting and start resting.

All focus is on streamlining the backlog by business value and delivering as much of this value as possible in the chosen rhythm. SCRUM makes it insanely hard to break out of that rhythm and fix the technical debt, in order to deliver more sustainable value later.

How Can We Improve Internal Software Quality In SCRUM?

In my previous work experience, Retrospectives or the “Definition of Done” were not sufficient to address these problems. Because of that, teams need to remove incentives to jeopardize internal quality.

So what do we do then?

Estimate The Complexity, Not The Amount of Work

When we first introduced SCRUM in our team, our agile coach suggested estimating stories simply with time units (as in days to complete) instead of the complexity. I did not object (because I didn’t know better), but today I believe this is the root cause of most problems we encountered. Even though it seems intuitive to quickly glance at something and guess “I can do this in X days”, it causes several problems that I was not aware of:

It brings a time constraint into planning, which is the opposite of how agile is supposed to work. Saying “I can do this in 3 days” essentially sets a deadline: “this is done 3 days after you start”.
Your 3 days are not equal to my 3 days. Because I have been working on the project much longer, I might be much quicker. Or, because I already know of problems in this module, it might take me longer.
It makes it much harder to justify any extra effort that comes up. With story points, by contrast, if the story is simply worth 8 points, it doesn’t matter if it took you 3 or 4 days. The number of story points per sprint is a measured result, not a target you intuitively try to meet.

This is precisely why you should never do your estimations with time-based units, but use story points instead. Many developers I talked to still believe that story points are basically “a Fibonacci-like expression of the estimated amount of work”. This couldn’t be further from the truth.
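To make the idea of a "measured result" concrete, here is a minimal Python sketch. The sprint history and the names CompletedStory and velocity are made up for this illustration and do not come from any SCRUM tool. The point is only that the days spent are recorded after the fact and never enter the calculation.

```python
from dataclasses import dataclass


@dataclass
class CompletedStory:
    name: str           # hypothetical story title
    story_points: int   # complexity estimated during refinement
    days_spent: float   # recorded afterwards, never used as a target


def velocity(sprints: list[list[CompletedStory]]) -> float:
    """Average number of story points completed per sprint."""
    totals = [sum(story.story_points for story in sprint) for sprint in sprints]
    return sum(totals) / len(totals)


# Made-up history: two 8-point stories that took 3 and 4 days respectively.
history = [
    [CompletedStory("export report", 8, 3.0), CompletedStory("fix login", 3, 1.5)],
    [CompletedStory("new adapter", 8, 4.0), CompletedStory("audit log", 5, 2.0)],
]

print(velocity(history))
```

Both 8-point stories count the same here, no matter whether they took 3 or 4 days. The resulting number is something you read off after the sprints, not something you promise beforehand.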

Pay Your Respects To The Complexity

As a manager or lead developer, honor the reality that not everybody can solve all the problems. Accept that complex problems require skilled and experienced people. When you staff your development team, consider the most experienced developer as the “upper bound” for the quality of your software.

I once made this mistake and paid dearly for it when we wanted to implement a message-based system integration. We hired a contractor to implement the adapters because nobody in my team had experience with the desired tool stack. But for some reason, the contractor only put a junior developer on the project. It took me more than 3 months to realize that this guy wasn’t even capable of understanding what we were trying to build.

How this happens probably justifies its own blog post, so I will cut right to the chase. In the end, I hired a highly skilled senior developer who understood me right off the bat. It took me less than one day to explain the requirements and after less than 2 weeks, I had the complete adapter ready for deployment. He was able to fill in all the gaps and proposed a solution concept that was far superior to what I had envisioned.

In the end, I had literally 10 times the functionality in a fraction of the time. So what’s the takeaway? Experience matters big time. Put at least one experienced developer on a development team. This will massively accelerate the learning and boost the productivity of the junior developers.

Integrate Refactoring in Your Estimation

Plan for the unknown. This is much easier if you already estimate with story points instead of units of time. But even if you do, there is still a spectrum between a clean solution that takes a little more effort and, at the other end, a solution that usually involves a lot of “hacking”. The clean solution might be an 8 instead of a 3, but it will make some upcoming changes you already anticipate much easier to implement.

Actively engage the development team in a discussion about how a certain story should be implemented before they estimate it. Try to design every solution in a way that the implementation actually reduces the technical debt.
Plan with known and existing technical debt. Incorporate the effort to fix that debt in the complexity of the story. Fight off the desire to bypass it.
Lastly, account for technical debt the story could introduce. The hackier or less useful a requirement feels, the higher the complexity of that story should be.

As a rule of thumb, I personally think over-engineering is cheaper than fiddling. Over-engineering almost never costs us speed in the future; it only costs speed today. Fiddling, on the other hand, almost always costs us speed in the future. For a project that runs longer than a couple of months, the marginal gains today never justify the losses tomorrow.

Be Clean

I believe the single most important thing any developer should embrace is the “Boy Scout Principle”. Everyone should treat the code base as his or her responsibility. We should all feel responsible if we encounter something that can be improved. If every developer on the team feels safe to take necessary detours from time to time, overall code quality will continuously increase instead of deteriorating.

Encourage junior developers or newcomers to challenge the existing code base. If a module is hard to understand and even harder to use, this is typically a great opportunity for improving it.
Senior developers on the team should commit to helping inexperienced developers not only understand a certain module, but also make it more robust and easier to use.
In a nutshell: work clean. Don’t ship shit. Always be ready. Continuous improvement. Show fearless competence.

Refactoring should be a fundamental part of your daily routine. Yes, we will touch a running system. Yes, we might break it. And yes, we learn how to not break it in the future (by introducing an automated test case that catches the error for the next one who needs to touch it). Yes, this takes time. But: This builds an environment where everyone can safely learn and that grows more and more robust over time. Refactoring should never be something you do next Sprint, nor should it be something that you ask the business permission for.
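As a rough sketch of what such a test case can look like, assume a refactoring broke a small helper that parses quantities with a thousands separator. The helper parse_quantity and the bug itself are invented for this example. The test pins down the exact input that failed, so whoever touches the module next finds out immediately.

```python
def parse_quantity(raw: str) -> int:
    """Hypothetical helper: parses a quantity such as "1,000" into an integer."""
    # The (invented) bug: a refactoring dropped the removal of the separator.
    return int(raw.replace(",", ""))


def test_thousands_separator_is_accepted():
    # Regression test for the input that broke after the refactoring.
    assert parse_quantity("1,000") == 1000


def test_plain_number_still_works():
    # Guards the behavior that already worked before the fix.
    assert parse_quantity("42") == 42
```

A test runner such as pytest would pick up both functions by their test_ prefix. They can also simply be called by hand.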

Summary

SCRUM is a great process. It encourages continuous learning and describes an easy-to-use interface for the business to control a development team. It introduces concepts like stories and business values and describes rituals to prioritize and deliver effectively. However, the responsibility for implementing that interface lies with the development team, and it is far too easy to get it wrong. In my opinion, SCRUM focuses too much on the business and tells us nothing about the practices that turn us into a team that delivers high-quality software (encompassing both quality aspects).

This makes it very hard for teams to keep the internal quality up and to avoid eventually grinding to a halt after a few months of highly productive sprints. I believe SCRUM puts too much emphasis and guidance on “external software quality”, while not balancing this out with guidance on “internal software quality”. This may lead inexperienced teams to the wrong conclusions and incentivize unsustainable behavior.