Before you read this, allow me to make one thing clear: SCRUM is awesome. It is so superior to how we previously worked that I never want to go back. However, during the introduction and our first sprints, I noticed something very strange and unforeseen: We suddenly felt pressured to deploy low-quality software.

I want to investigate why that happened, what mistakes we made and what we did to avoid them in the future. Hopefully, you can learn from my mistakes.

A Brief Digression on Software Quality

Software essentially has two types of quality: internal quality and external quality. In general, both are equally important. However, they are relevant to completely different stakeholders, and, more importantly, their relative importance changes as the software progresses through its life cycle.

Internal quality assesses the technical foundation of the software. It describes how the software is engineered. Internal quality is concerned with things like technical debt, the robustness and completeness of the test harness, general understandability and maintainability (see Clean Code) and whether the design patterns used are appropriate for the problems at hand. It matters to the development team.

External quality, on the other hand, assesses how well the software satisfies the user’s needs. It essentially describes whether the software is useful at all. External quality is concerned with things like a net promoter score, user satisfaction or process-oriented measures like the degree of automation. It matters to the users, the sponsors and every external stakeholder.

I think it is pretty straightforward to conclude that without external quality, nothing else matters. Also, nobody cares about internal quality for a one-time proof of concept. It is also much less important at the beginning of a project, when we are still in the prototyping phase with a small group of trusted key users. However, as the software grows more and more complex, our user base gets bigger and the stakes get higher (“people die if the software fails”), internal quality becomes more and more important.

And this is where it gets interesting.

SCRUM Encourages Low Internal Quality

SCRUM as a process or management framework does not care about internal quality. It is all about external quality. Every metric and every ritual I know of is designed to raise the external quality of the software that is being developed. And since it is insanely effective at doing so, teams are continuously pressured to jeopardise internal quality in order to please the rest of the stakeholders.

Maybe we are all too dumb and just got it all wrong. That could be the case. But I witnessed the exact same development in four different teams at three different companies. I do not believe that the 80+ people involved (including multiple certified coaches) are stupid. Instead, I choose to believe that “by-the-book” SCRUM is fundamentally flawed.

Why Does SCRUM Encourage Low Quality Software?

SCRUM encompasses multiple concepts and rituals that are very valuable when looked at in isolation. However, the dynamics that emerge when you put them together result in low internal quality and short-term thinking.

  • SCRUM is not “laissez-faire” about commitments and about the sprint goal.
  • Most of the SCRUM metrics are outcome- or value-oriented from a user perspective.
  • SCRUM has a planning horizon of one to four weeks and no longer-term view.
  • Prioritisation is done by “the business” or other non-technical persons.

So here’s the point: Everything that SCRUM does focusses on boosting external quality (i.e. prioritising and delivering new features). SCRUM gives a lot of guidance on how to improve external quality, but little to no guidance on how to improve internal quality.

Teams that do not have an experienced lead engineer who can ensure internal quality are pressured by SCRUM’s metrics and rituals to focus exclusively on delivering as many features as possible. This usually means neglecting internal quality. And think about this for a second: Because they are following an “industry best practice” and stakeholders are typically very happy with the initial progress and transparency after the first few sprints, such a team sees absolutely no reason to deliver fewer features and take the time to refactor the application.

Teams that do possess such knowledge have to actively work against SCRUM’s incentive structure. SCRUM makes every little bit of progress in external quality very visible to everyone, while being completely opaque about internal quality. Naturally, this makes it very hard to hold back pull requests that do not satisfy the requirements for internal quality, especially if stakeholders or product owners are already satisfied with the external quality.

Why does that happen?

Sprint Goal & Commitment

During refinement, we estimate the top stories in our backlog. At this point, this is nothing but a more or less educated guess, typically based on a “reference story” and a collaborative vote. During sprint planning, we pick as many stories as we expect to get done during the sprint. In practice, this means converting the complexity into a personal estimate of how long I believe the implementation will take.

So why can this become a problem? Because technical debt is everywhere. It may be the module you wrote 6 months ago. It may be something a Junior put into production while the supervising Senior was on vacation. It may be something that simply slipped through review. We are all constantly learning and evolving, and if you believe that your work from over a year ago is still top-notch, you probably did not get any better at what you do.

We will inevitably encounter some sort of technical debt during development that we did not plan for. And this will force us to make a decision: Do we fix that debt or do we hide it (and “elegantly” work around it)?

Think about it for a second: If we fix the code, we will almost certainly miss our commitment (unless we work overtime to compensate for the extra effort). If we miss the commitment, we have to explain ourselves in the Review and the Retrospective. Keep in mind: This decision is mainly driven by emotions and by the psychological stress it causes.

Because we plan with partial information and because SCRUM takes its planning and commitments very seriously, it actively encourages a development team to close their eyes to the debt they encounter.

Metrics and Transparency

All SCRUM teams I know work with stories. In a nutshell, a story is something that is valuable to the business. Stories are grouped into epics, and, depending on the size of the project, either the epics or the individual stories are prioritised. So how is a SCRUM team eventually measured? By the number of stories they complete.

Tell me how you measure me, and I will tell you how I will behave

– Eliyahu M. Goldratt

I do believe that internal quality should never serve as an end in itself. I see it as an investment that will pay off later, while costing me something today. So, in order to positively impact the metrics that are actually gauged, it is easy to skip the investment today. “We will deal with that later, once we have finished that shiny new feature for the business.”

Bear with me here: I’m not saying that SCRUM forces anybody to behave that way. But psychologically, it is much easier to focus on what is being measured than to focus on something nobody outside of the development team understands and that will negatively impact what is visible.

The real-life constraints we operate under make this even harder: Most of us aren’t well-seasoned experts in a well-functioning team within a well-functioning company who have built whatever we are building a dozen times already. We don’t know yet what the final architecture will be, the one that strikes the perfect balance between long-term investment and short-term delivery. We are all constantly learning, and we will make mistakes and take detours.

Prioritisation is Done By Non-technical Folks

So why don’t we simply put refactoring “stories” into the backlog and have them prioritised? Because prioritisation is done by non-technical people and is supposed to be based on “business value”. Explaining the business value of a refactoring is futile, because today it is zero. Refactoring enables us to hopefully deliver more business value in the future by certainly delivering less business value today.

Within the regular SCRUM process, there is no way to get refactoring prioritised separately.

SCRUM Has A Short-sighted Planning Horizon

The combination of these principles is bad enough already. But I believe that the concept of one- to four-week “Sprints” as the only planning horizon that SCRUM knows of is the actual culprit. SCRUM introduces no high-level or far-sighted planning cycle that embeds a series of sprints. This gives us no guidance on when to stop sprinting and start resting.

All focus is on ordering the backlog by business value and delivering as much of this value as possible in the chosen rhythm. SCRUM makes it insanely hard to break out of that rhythm and fix technical debt in order to deliver more sustainable value later.

How Can We Improve Software Quality In SCRUM?

In my previous work experience, Retrospectives were not sufficient to address these problems. A minimum consensus on quality, such as the “Definition of Done”, was also far from enough.

What a team needs to do is actively address that field of tension. This can be done by removing all incentives to jeopardise internal quality in favour of delivered features.

So what do we do then?

Estimate The Complexity, Not The Amount of Work

When we first introduced SCRUM in our team, our agile coach suggested estimating stories simply in time units (as in: days to complete) instead of complexity. I did not object (because I didn’t know better), but today I believe this is the root cause of most problems we encountered. Even though it seems intuitive to quickly glance at something and guess “I can do this in X days”, it causes several problems that I was not aware of:

  • It brings a time constraint into planning, which is the opposite of how agile is supposed to work. Saying “I can do this in 3 days” essentially sets a target deadline of “this is done 3 days after you start”.
  • Your 3 days are not equal to my 3 days. Because I have been working on the project much longer, I might be much quicker. Or, because I already know of problems in this module, I might even take longer.
  • It makes it much harder to justify any extra effort that comes up. With story points, by contrast, if the story is simply worth 8 story points, it doesn’t matter if it took you 3 or 4 days. The number of story points per sprint is merely a measured result, not a target you intuitively try to meet.

This is precisely why you should never do your estimations in time-based units, but use story points instead. Many developers I talked to still believe that story points are basically “a Fibonacci-like expression of the estimated amount of work”. This couldn’t be further from the truth.
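To make the idea of “a measured result, not a target” a bit more concrete, here is a minimal sketch of how a team might derive its velocity from past sprints and use it only as a planning capacity. The sprint history, the function names and the three-sprint window are all assumptions made for illustration; they are not part of SCRUM itself.

```python
from statistics import mean

# Hypothetical history: story points of the stories completed in each past sprint.
completed_sprints = [
    [3, 5, 8, 2],  # sprint 1 -> 18 points
    [5, 5, 3],     # sprint 2 -> 13 points
    [8, 3, 5, 5],  # sprint 3 -> 21 points
]


def velocity(sprints, window=3):
    """Average story points completed over the last `window` sprints.

    This is an observation about the past, not a commitment for the future.
    """
    recent = sprints[-window:]
    return mean(sum(points) for points in recent)


def plan_sprint(backlog, capacity):
    """Take stories from the prioritised backlog, in order, until the next
    story would exceed the capacity."""
    planned, total = [], 0
    for name, points in backlog:
        if total + points > capacity:
            break
        planned.append(name)
        total += points
    return planned, total


# Hypothetical backlog of (story, story points), already ordered by priority.
backlog = [("export report", 8), ("login fix", 5), ("new dashboard", 8), ("typo", 2)]
planned, total = plan_sprint(backlog, capacity=velocity(completed_sprints))
print(planned, total)
```

Note that nothing in this sketch converts points into days. If a story takes longer than expected, the measured velocity simply drops a little next time, and the plan adapts.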

Integrate Refactoring in Your Estimation

Refactoring should be a fundamental part of your daily routine. Yes, we will touch a running system. Yes, we might break it. Yes, we learn how to not break it in the future (by introducing an automated test case that catches the error for the next person who needs to touch it). Yes, this takes time. But: This builds an environment where everyone can safely learn and that grows more and more robust over time. Refactoring should never be something you do next Sprint, nor should it be something that you ask the business for permission to do.
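As a small illustration of “introducing an automated test case that catches the error”, here is a sketch of a regression test. The function parse_quantity and the bug it once had are entirely hypothetical; the point is only that the mistake gets pinned down by a test the moment it is fixed.

```python
import unittest


def parse_quantity(raw: str) -> int:
    """Hypothetical function that broke during a refactoring: it crashed on
    input with thousands separators such as "1,000". Removing the commas
    before converting is the fix."""
    return int(raw.replace(",", ""))


class ParseQuantityRegressionTest(unittest.TestCase):
    """Pins down the fixed behaviour so the next change cannot silently undo it."""

    def test_thousands_separator_is_accepted(self):
        self.assertEqual(parse_quantity("1,000"), 1000)

    def test_plain_number_still_works(self):
        self.assertEqual(parse_quantity("7"), 7)
```

Assuming the file is discoverable by the test runner, something like python -m unittest would run it as part of the usual test suite.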

Translated to estimation and planning, this means: Plan for the unknown. This is much easier if you already estimate with story points instead of units of time. But even if you do, there is still a spectrum between a clean solution that involves a little more effort and, at the other end, something that usually involves a lot of “hacking”. The clean solution might be an 8 instead of a 3, but it will make some upcoming changes you already anticipate much easier to implement.

  • Actively engage the development team in a discussion about how a certain story should be implemented before they estimate it. Try to design every solution in a way that the implementation actually reduces the technical debt.
  • Plan with known and existing technical debt. Incorporate the effort to fix that debt in the complexity of the story. Fight off the desire to bypass it.
  • Lastly, account for technical debt the story could introduce. The hackier or less useful a requirement feels, the higher the complexity of that story should be.

As a rule of thumb, I personally think over-engineering is cheaper than fiddling. Over-engineering almost never costs us speed in the future; it only costs speed today. Fiddling, on the other hand, almost always costs us speed in the future. For a project that runs longer than a couple of months, the marginal gains today never justify the losses tomorrow.

Integrate Refactoring in Your Daily Routine

I believe the single most important thing any developer should embrace is the “Boy Scout Principle”: always leave the code a little cleaner than you found it. Everyone should treat the code base as his or her responsibility. We should all feel responsible if we encounter something that can be improved. If every developer on the team feels safe to take necessary detours from time to time, overall code quality will continuously increase instead of deteriorate.

  • Encourage junior developers or newcomers to challenge the existing code base. If a module is hard to understand and even harder to use, this is typically a great opportunity for improving it.
  • Senior developers on the team should commit to helping inexperienced developers not only understand a certain module, but also make it more robust and easier to use.


SCRUM is a great process. It encourages continuous learning and implements an easy-to-use interface for the business. That interface lets the business steer the development team through stories, business value prioritisation and a short delivery rhythm. However, the responsibility for implementing that interface lies with the development team, and it is far too easy to get it wrong. In my opinion, SCRUM focusses too much on the business and tells us nothing about the practices needed to actually become a team that delivers high-quality software (encompassing both quality aspects).

This makes it very hard for teams to keep internal quality up and to avoid eventually grinding to a halt after a few months of highly productive sprints. I believe SCRUM puts too much emphasis and guidance on “external quality”, while not balancing this out with guidance on “internal quality”. This may lead inexperienced teams to the wrong conclusions and incentivize unsustainable behaviour.