Almost every company conducts research and development (R&D) to bring innovative products to market. The development of innovative products is inherently risky. As they are products that have never been offered, it is usually unclear whether and to what extent the desired product features can be realized and how exactly they can be realized. The success of R&D therefore depends largely on the acquisition and use of knowledge about the feasibility of product features.

Scrum is by far the most popular form of agile project management today¹. Some of the key features of Scrum are constant transparency of product progress, goal orientation, simple processes, flexible working methods, and efficient communication. Scrum is a flexible framework and contains only a handful of activities and artefacts². Scrum deliberately leaves open how Backlog Items (BLIs) are structured, what the Definition of Done (DoD) of BLIs is, and how exactly the refinement process should work. A Scrum team has to manage these aspects itself. This flexibility is another reason why Scrum is so successful: Scrum is used in various domains, supplemented with domain-specific processes or artefacts³.

Scrum is also well established in R&D⁴. As described above, the success of an R&D Scrum team depends on how efficiently the team acquires and uses knowledge. In Scrum, so-called spikes have become established for knowledge acquisition. Spikes are BLIs in which knowledge about the feasibility and cost of product features is gained without actually realising the features. In this article we want to show what is important when implementing spikes in Scrum and how a Scrum team can ensure that the knowledge gained from spikes is used optimally. We illustrate these good practices with concrete examples from 1.5 years of Scrum in a research project with EnBW.

Spikes
Good Practice: Regular interval of ”spike maintenance” of spikes in the backlog
Good Practice: Lab book
DoR and DoD for Spikes: Hypothesis-first learning
Conclusion

Spikes

A widely used method for gaining knowledge in Scrum is called spikes – BLIs, where the feasibility and effort of product features are assessed without realizing the features. The idea of spikes comes from eXtreme Programming (XP)⁵.

Acquiring knowledge through spikes serves to reduce excessive risk⁶. The proportion of spikes in the backlog should, therefore, be proportional to the current risk in product development. In an R&D Scrum team, there may be many open questions and risks, and spikes may account for more than half of a sprint, as in the Google AdWords Scrum team⁷. However, an excessive proportion of spikes inhibits a team’s productivity because too much time is spent gaining knowledge and too little time is spent implementing the product⁸. Prioritizing spikes over regular BLIs is an important factor for the success of R&D Scrum projects.

Another important factor is the definition of quality criteria for spikes. Specifically, a Scrum team can establish a Definition of Ready (DoR) and a Definition of Done (DoD) for spikes. The DoR defines what criteria a spike must meet to be processed; the DoD defines what criteria a spike must meet to be done. Both definitions influence the quality and quantity of the generated knowledge and its usability in the Scrum process. The DoD is a mandatory part of Scrum. On the other hand, the introduction of a DoR is at the discretion of the Scrum team and is not always useful⁹.

The reason for a spike is almost always an acute and concrete risk in product development. However, the lessons learned from a spike can be valuable beyond the specific reason for the spike and can help inform product development decisions in the long term. To have a long-term effect, the team needs to store the knowledge gained from spikes appropriately.

The use of spikes in R&D projects therefore raises at least three questions:

How are spikes prioritised against each other and against other BLIs?
How are DoR and DoD formulated for spikes?
How is the knowledge gained from spikes stored?

The literature on Scrum leaves these questions largely unanswered. In a multi-year R&D research project with EnBW, we applied various answers to the questions and gained experience with them. We developed two good practices in the process, which we present in this blog post: (1) a regular interval for “spike maintenance,” i.e., for sharpening concrete hypotheses in spikes, and (2) a lightweight lab book for logging findings. We also supplement the two practices with a suitable DoR and DoD.

Good Practice: Regular interval of the „Spike Maintenance“ in the Backlog

Refinement is an ongoing activity of the whole Scrum team. BLIs from the backlog are prepared for processing: BLIs are reformulated, divided into independent BLIs and broken down into specific work steps. Many Scrum teams use regular intervals to refine and gain a common understanding of the backlog.

In our experience, collaborative refinement is a common source of spikes. This is because unresolved issues and risks come to light when the team discusses the backlog. Unresolved risks in a BLI often manifest in the team’s difficulty defining a concrete DoD and the wide variation in team members’ estimates of the effort involved. Specific phrases used by team members may also indicate risks, e.g.:

“I don’t know if it’s possible.”
“I don’t know how to do it yet.”
“I’ll have to find out first.”

Once a risk has been identified, the team must decide whether it is so great that it should be reduced with a spike. The spike is then created in the backlog as a BLI draft. To not exceed the regular refinement deadline, we recommend that newly created spikes are not refined and prioritized in this deadline. Instead, we recommend that the team meet regularly one to two days after the regular refinement to perform “spike maintenance.” In the meantime, spike drafts can be assigned to individual team members to prepare for the spike maintenance interval, e.g., by collecting concrete hypotheses and ideas for experiments.

Of course, spikes can also be created outside of a joint sharpening appointment, e.g., when performing a BLI. Even then, it is a good idea to collect the spikes as drafts and then sharpen them at the next spike maintenance interval. This way, the spike events are not forgotten but do not lead to scope creep in the current sprint.

During the spike maintenance interval, the collected spikes must be refined and prioritized in the backlog. In our experience, most spikes are directly related to one or more BLIs – the BLIs whose implementation poses acute risks. In this case, prioritization is simple: use the existing prioritization of BLIs and add each spike to the list before its associated BLI.

Good Practice: Lab book

We suggest a light lab book to store the knowledge gained from the spikes. The lab notebook documents the following aspects for each spike in 1-2 sentences.

Problem definition: What acute risk in product development needs to be mitigated? What questions need to be answered?
Hypotheses: Formulate hypotheses that are as concrete as possible and that can either be researched or tested experimentally.
Findings: Summary of the experimental or research results.
Consequences: The significance of the findings for the product. E.g. “Product characteristic X can be achieved with Y, but not with Z”.
Relevant links, e.g. to detailed experimental results and BLIs

Completeness and compression are crucial for the long-term added value of the lab notebook. Completeness means that the results of all spikes end up in the lab book; compression means that only the essential results and consequences for the project are briefly and concisely recorded in the lab book. In this quality, the lab book can be used as a discussion reference.

In the project with EnBW, we implemented the lab book as a wiki page in the team’s project wiki. We sorted the entries on the page in descending chronological order, i.e., the knowledge from the current spikes was at the top. We used the lab book for about a year and noticed no scaling problems. The lab book was highly regarded as a source of knowledge within the team and was regularly used in discussions.

DoR und DoD for Spikes: Hypothesis-first learning

This brings us to our final recommendation: a concrete definition of the DoR and DoD for spikes, in line with the two previous good practices.

The DoR specifies which criteria spikes must fulfil before they can be included and worked on in a sprint. As mentioned, the DoR is not part of Scrum, but a commonly used extension. Some Scrum experts are critical of the DoR because it can lead to important BLIs not being included in a Sprint for “aesthetic reasons”¹⁰. We, therefore, propose a compromise: important BLIs are always included in the next sprint, but the person working on the BLI is responsible for ensuring that the BLI fulfills the DoR before it is worked on. This means fulfilling the DoR criteria is the first step of any BLI.

Our DoR proposal for spikes is simple: before a spike is processed, the lab book entry for the spike must be filled in as precisely as possible. In particular, the hypotheses to be investigated must be listed. In the EnBW team, we have had good experience with one team member formulating the hypotheses before working on the spike and then having them critically reviewed by another team member. Once the hypotheses are complete, understandable, and clearly defined, work on the spike can begin.

The lab book entry provides a framework for the spike. This is similar to the well-known Test-Driven Development (TDD) in software development. Unit tests are first used to define what the software should do, and then it is implemented. The creation of a lab book entry before each spike could be described by analogy as hypothesis-driven learning (HDL): first, hypotheses are used to define what is to be learned, then the hypotheses are tested.

The analogy between TDD and HDL is also useful in describing the advantages of HDL. Just as TDD provides a constant incentive to reduce software complexity, HDL provides an incentive to keep experiments simple and focused. And just as the programmed tests in TDD replace much of the written documentation of software, the lab book entries in HDL replace much of the written documentation of knowledge from spikes.

A spike’s DoD is at least as important as its DoR. It determines what criteria must be met for the spike to be completed. The spike’s lab book entry can also determine these criteria. Specifically, we propose that a spike is complete when all hypotheses listed in the lab notebook entry have been confirmed or rejected, the lab notebook entry has been written, and the results have been communicated to the team.

Conclusion

In this blog post we have made a concrete proposal on how to use spikes in R&D Scrum projects. Our proposal enables an R&D Scrum team to assess the feasibility of innovative product features – in time before the product is implemented and with reasonable effort.

At the core of our proposal are two good practices that have proven successful in our own R&D Scrum projects: first, a spike maintenance interval, and second, a lightweight lab book for logging findings. We have shown how both practices can be integrated into Scrum using a simple DoR and DoD.

With the two good practices we fill a gap in the Scrum literature: there are hardly any concrete practices for the creation, prioritisation, quality assurance and long-term use of spikes. The proposed good practices are simple and general enough to be applicable and useful for most R&D Scrum teams. We would be happy if other Scrum teams could be inspired by them.

Author

Moritz Renftle
Data Scientist at scieneers GmbH
moritz.renftle@scieneers.de

Using Scrum to Succeed in Research and Development

Table of Contents