Systems Thinking Primer
I have written a series of articles that apply systems thinking to explore various organizational design options and change strategies. The goal of this primer is to prepare you for understanding and practicing systems thinking. Besides describing basic concepts, we shall use examples from product development organizations to show the significance of systems thinking in the context of large-scale product development. Along the way, I shall share insights from my own practice.
1. Systems Thinking
Why do we choose to apply systems thinking to explore and guide organizational design and change? Peter Senge, in his book "The Fifth Discipline: The Art & Practice of The Learning Organization", distinguishes between detail complexity and dynamic complexity. Detail complexity is characterized by having many variables, while dynamic complexity is characterized by subtle relationships between cause and effect. Cause and effect may not be close in time and space, which brings high dynamic complexity. Systems thinking helps us understand such problems better and thus enables more effective intervention. The design and change of large-scale product development organizations involve not only detail complexity but also dynamic complexity. Thus, systems thinking is a good fit there.
Different authors define systems thinking in different ways. The concepts and tools I use in my articles are borrowed from system dynamics, founded by Professor Jay Forrester of MIT. There are four main tools: Causal Loop Diagram (CLD), Behavior over Time Diagram (BoT), Stock & Flow Diagram (SFD) and Computer Simulation. Many insights from systems thinking are counter-intuitive. Therefore, quantitative analysis based on computer simulation enables us to generate new insights and further change our mental models. However, considering that 1) approaches to quantitative analysis through mathematical modeling are not yet mature for the topic of organizational design and change, and 2) qualitative analysis and critical thinking based on CLD are already helpful in changing our mental models (i.e. shifting our thinking about cause and effect from linear to loop-based), I shall use CLD as the main tool for system modeling and analysis in this primer.
2. Causal Loop Diagram
You can find introductions to CLD in any general systems thinking book, e.g. "Seeing the Forest for the Trees: A Manager's Guide to Applying Systems Thinking" or "Systems Thinking Basics: From Concepts to Causal Loops". Its basic elements include: variable, link and loop. Let's introduce them one by one.
Variable
A variable is a factor in the system structure we are trying to model. Its value changes over time. In a product development organization, common variables include: number of people, amount of requirements, cycle time for delivery, velocity, flexibility, quality, value, morale, satisfaction, etc.
When defining variables, it is worth mentioning the following:
- A variable is a noun. For example, "speed up" is not a variable; the related variable could be "speed", which may increase or decrease.
- A variable can be tangible or intangible. For example, the variable "number of people" is tangible, while the variable "morale" is intangible.
- A variable can be decomposed into more granular ones. For example, the variable "amount of requirements" could be further broken down into "number of requirements" and "size of requirements".
Link
A link is a causal connection between two variables. The link from variable A to variable B means that, all other factors being equal, a change in A will lead to a change in B. According to the relative direction of change, a link can have one of two polarities: positive, noted as "+", or negative, noted as "-". There may also be a delay in the associated change, noted as "||".
Let me illustrate with two examples.
1. The link between "amount of requirements" and "number of people"
Does a bigger "amount of requirements" cause a higher or lower "number of people"? Or is there no relation at all (i.e. no link between them)?
One explanation is, the bigger the "amount of requirements", the higher the "number of people", because we would need to hire more people to work on requirements. Thus a positive link forms between the "amount of requirements" and the "number of people".
However, with this logic we find that "number of people" is not a very precise variable. It could mean "number of required people" or "number of available people". We can distinguish between them in the diagram below to show the underlying causation more clearly. Note that "hiring" itself is not a variable; "hiring intensity" is more appropriate, as it may increase or decrease.
Bigger "amount of requirements" leads to higher "number of required people" which leads to increase in "hiring intensity" and increase in "number of available people".
Are there any delays in the above three links? The delay itself is a relative concept. There is a continuum between no delay and infinite delay. In qualitative analysis, we make a judgement on how much this delay influences the dynamic and then we decide whether to include it. For example, considering that there is usually a significant delay from increasing "hiring intensity" to having higher "number of available people", we decide to explicitly present the delay in the link.
Bigger "amount of requirements" leads to higher "number of required people" which leads to increase in "hiring intensity" and, after some time, increase in "number of available people".
Besides positive/negative noted as "+"/"-", another common notation for the change direction in a link is same/opposite noted as "s"/"o". I decided to use positive/negative notation for the following reason.
Increasing "hiring intensity" leads to higher "number of available people"; the change is in the same direction. However, does decreasing "hiring intensity" mean lower "number of available people"? As long as we are still hiring, the number of available people will in fact not decrease. The two variables don't seem to move in the same direction, so the definition of same/opposite direction can be very confusing. The root of the confusion is that CLD does not distinguish between stocks and flows, while SFD does: "hiring intensity" is a flow that accumulates into the stock "number of available people". What, then, is the exact meaning of positive/negative link polarity? The more accurate statement is that lower "hiring intensity" causes the "number of available people" to become lower than it would otherwise have been. Therefore, regardless of the direction of change, the link is always positive.
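The "lower than it would be otherwise" reading can be made concrete with a small sketch. The function name `simulate_headcount` and all numbers below are made up for illustration; the only assumption is that hires per month simply add up into headcount and nobody leaves.

```python
def simulate_headcount(hiring_per_month, start=10):
    """Accumulate a monthly hiring flow into a headcount stock (no attrition assumed)."""
    headcount = [start]
    for hires in hiring_per_month:
        headcount.append(headcount[-1] + hires)
    return headcount


baseline = simulate_headcount([4, 4, 4, 4])  # hiring intensity stays at 4
reduced = simulate_headcount([4, 4, 2, 2])   # hiring intensity drops to 2

print(baseline)  # [10, 14, 18, 22, 26]
print(reduced)   # [10, 14, 18, 20, 22]
```

In the second run the hiring intensity decreases, yet the headcount keeps rising. It is only lower than in the baseline run, which is exactly what the positive polarity of the link claims.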
2. The link between "amount of overtime" and "output"
Is there any causal link between "amount of overtime" and "output"? Is it positive or negative?
One explanation is, bigger "amount of overtime" leads to bigger "work effort" which leads to more "output". The overall link polarity is positive.
Another explanation is, bigger "amount of overtime" leads to lower "morale" which leads to lower "efficiency" and less "output". There is a negative link between the "amount of overtime" and the "morale", which makes the overall link polarity negative too. In order to judge the overall link polarity, we only need to count the number of negative links in between. If the number is even, the overall polarity is positive; if the number is odd, the overall polarity is negative.
So, there are two causal paths between the "amount of overtime" and the "output": one is positive and the other is negative. Whether the total effect is positive or negative has to be decided through quantitative analysis.
Moreover, there is also a delay between the "amount of overtime" and the "morale". It takes a while for overtime to decrease the morale. Bigger "amount of overtime" after some time leads to lower "morale", then lower "efficiency" and less "output". Delay does not change the link polarity, but we shall see its significance later.
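The counting rule for the overall polarity of a path can be written down in a few lines. This is a minimal sketch; the function name `path_polarity` and the list-of-signs representation are my own choices for illustration, not standard CLD tooling.

```python
def path_polarity(link_polarities):
    """Return the overall polarity of a causal path as '+' or '-'."""
    negatives = sum(1 for polarity in link_polarities if polarity == "-")
    return "+" if negatives % 2 == 0 else "-"


# overtime -> work effort -> output
effort_path = ["+", "+"]
# overtime -> morale -> efficiency -> output
morale_path = ["-", "+", "+"]

print(path_polarity(effort_path))  # +
print(path_polarity(morale_path))  # -
```

The sketch only tells us the sign of each path. It says nothing about which path dominates, which is why the total effect of overtime on output still needs quantitative analysis.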
Loop
Several links can form a loop. If there is a link from A to B, and there are other links via a series of other variables from B to A, a loop forms. There are two types of loops: one is a reinforcing loop, noted as R; and the other is a balancing loop, noted as B.
1. Reinforcing loop (R)
This is a common dynamic around a product. An increase in "value" leads to bigger "sales" which leads to higher "profit" and increased "investment" and an even higher "value". This forms a reinforcing loop.
It is worth noting that I read the above as a virtuous cycle. However, it can also be read in the other direction. A decrease in "value" leads to smaller "sales" which leads to lower "profit", reduced "investment" and an even lower "value". This is the same reinforcing loop, but now it works as a vicious cycle. So, a reinforcing loop can be either virtuous or vicious.
When we create a loop, naming it is an important thinking step. A loop is only a part of a bigger dynamic. By naming the loop, we are able to explore the overall dynamic at a higher level. Take the example of the reinforcing loop above. We can name it as "value drives product growth".
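To see how the same reinforcing loop can run in either direction, here is a toy sketch. Everything quantitative in it is an assumption made for illustration: sales are taken to equal value, profit is sales minus a fixed `breakeven` cost, and half of the profit (or loss) flows back as investment. It is not a calibrated model.

```python
def reinforcing_loop(value, breakeven=100.0, reinvest=0.5, periods=3):
    """Iterate value -> sales -> profit -> investment -> value with toy numbers."""
    history = [value]
    for _ in range(periods):
        sales = value                   # assumed: sales proportional to value
        profit = sales - breakeven      # assumed: fixed cost per period
        investment = reinvest * profit  # negative profit means disinvestment
        value = value + investment
        history.append(value)
    return history


print(reinforcing_loop(110.0))  # [110.0, 115.0, 122.5, 133.75]
print(reinforcing_loop(90.0))   # [90.0, 85.0, 77.5, 66.25]
```

Starting slightly above the break-even point, the loop amplifies the advantage; starting slightly below, it amplifies the decline. The structure is identical in both runs.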
2. Balancing loop (B)
Continuing from our previous example on "amount of requirements" and "number of available people", we discover that our definition of "amount of requirements" is not accurate either. It could be further defined as three related but different variables: "amount of done requirements", "amount of to-do requirements", and "amount of input requirements". The "amount of to-do requirements" is the difference between the "amount of input requirements" and the "amount of done requirements".
The bigger the "amount of to-do requirements", the higher the "number of required people". The "hiring intensity" then goes up, as does the "number of available people", which leads to a bigger "amount of done requirements" and a smaller "amount of to-do requirements". This forms a balancing loop. We can name it "hiring people to get requirements done".
We become familiar with vicious and virtuous cycles through life experience, so reinforcing loops are not so hard to understand. Balancing loops are harder. A characteristic of a balancing loop is goal-seeking. We can define a problem as a gap between a goal and current reality. The gap drives changes in behavior, which change the current reality and thereby reduce the gap.
The classical example for explaining a balancing loop is the adjustment of water temperature. You have a "target temperature" and an "actual temperature". When there is a gap between them, it drives you to increase the "level of adjustment", leading to an increase in the "actual temperature" and a reduction of the "temperature gap". You gradually slow down the adjustment, eventually reaching the "target temperature". This is the balanced state. It is similar to the balancing loop of "hiring people to get requirements done". The "amount of to-do requirements" is the equivalent of the "temperature gap", and hiring corresponds to adjusting the water temperature.
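The goal-seeking behavior of the water temperature example can be sketched as follows. The `gain` parameter (how strongly we react to the gap) and the temperatures are assumed values chosen for illustration.

```python
def adjust_temperature(target_temperature, actual_temperature, gain=0.5, steps=6):
    """Close a fraction of the temperature gap at every step."""
    history = [actual_temperature]
    for _ in range(steps):
        temperature_gap = target_temperature - actual_temperature
        level_of_adjustment = gain * temperature_gap  # bigger gap, bigger adjustment
        actual_temperature += level_of_adjustment
        history.append(actual_temperature)
    return history


print(adjust_temperature(40.0, 20.0))
# [20.0, 30.0, 35.0, 37.5, 38.75, 39.375, 39.6875]
```

The adjustment shrinks together with the gap, so the actual temperature approaches the target ever more slowly instead of overshooting. That is the signature of a balancing loop without delay.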
Let us look at another example of the balancing loop.
In the left diagram, the bigger the "pressure", the higher the "number of shortcuts" taken, the faster the "speed" and the smaller the "pressure". Shortcuts mean all kinds of approaches that speed up delivery in the short term by sacrificing quality, e.g. doing less testing, copying and pasting, ignoring error handling, etc. We can name it "taking shortcuts to speed up". Where is the "temperature gap" here? It is the pressure! "Pressure" is actually caused by the gap between some "target speed" and the "actual speed". Taking shortcuts is equivalent to adjusting the water temperature. In the right diagram we illustrate it in the same way as in the example of adjusting water temperature. They are essentially the same diagram.
To detect whether a loop is balancing or reinforcing, besides going through each variable and link, there is a simpler way: count the number of negative links. If it is even, the loop is reinforcing. If it is odd, the loop is balancing. The step of naming the loop serves as a verification. When you find that the name of a reinforcing loop is about solving a problem or reaching a goal, or the name of a balancing loop is about growing or declining, you should examine the loop again.
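The counting rule for loops can be sketched in code as well. The representation of a loop as a list of (source, target, polarity) triples and the function name `classify_loop` are assumptions made for this illustration.

```python
def classify_loop(links):
    """Classify a closed loop given as (source, target, polarity) triples."""
    # Each link must end where the next one starts; the last feeds the first.
    for (_, target, _), (next_source, _, _) in zip(links, links[1:] + links[:1]):
        if target != next_source:
            raise ValueError("links do not form a closed loop")
    negatives = sum(1 for _, _, polarity in links if polarity == "-")
    return "reinforcing" if negatives % 2 == 0 else "balancing"


value_drives_product_growth = [
    ("value", "sales", "+"),
    ("sales", "profit", "+"),
    ("profit", "investment", "+"),
    ("investment", "value", "+"),
]
taking_shortcuts_to_speed_up = [
    ("pressure", "number of shortcuts", "+"),
    ("number of shortcuts", "speed", "+"),
    ("speed", "pressure", "-"),
]

print(classify_loop(value_drives_product_growth))   # reinforcing
print(classify_loop(taking_shortcuts_to_speed_up))  # balancing
```

The check at the top only guards against a list of links that does not actually close into a loop. The classification itself is nothing more than the even/odd count.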
3. System Archetype
Multiple loops interact to form certain patterns. In systems thinking, those patterns are called system archetypes. There are roughly a dozen common archetypes, and some of them appear often in my articles. In the following, we shall make a brief introduction by using examples from product development.
Fixes that backfire
The "Fixes that backfire" system archetype consists of one balancing loop and one reinforcing loop. The example for this archetype builds on the previous "taking shortcuts to speed up" balancing loop. The higher the "number of shortcuts" taken, after some time, the higher the "number of defects", leading to a higher "amount of rework" and slower "speed", which leads to even bigger "pressure". To address the growing pressure we take even more shortcuts. Thus, a reinforcing loop forms. It can be named "being slowed down by defects". The unintended consequence is the key feature of this archetype. In this example, it is the growing number of defects caused by taking shortcuts. Delay plays an important role in this dynamic. It is the delay that makes the unintended consequence surprising.
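The role of the delay can be explored with a toy simulation. All parameters below (`boost`, `drag`, `delay`, the speeds) are invented for illustration, and the model makes strong simplifying assumptions: shortcuts react to the pressure of the previous step, every shortcut turns into defects after `delay` steps, and defects accumulate without ever being cleared.

```python
def fixes_that_backfire(steps=8, base_speed=8.0, target_speed=10.0,
                        boost=0.5, drag=0.3, delay=3):
    """Simulate shortcuts that help now and hurt after a delay (toy numbers)."""
    shortcuts = [0.0]
    speed = [base_speed]
    pressure = [target_speed - base_speed]
    defects = 0.0
    for t in range(1, steps + 1):
        # Balancing loop: more pressure, more shortcuts, more speed right away.
        shortcuts.append(pressure[t - 1])
        # Reinforcing loop: shortcuts taken `delay` steps ago now show up as defects.
        if t - delay >= 0:
            defects += shortcuts[t - delay]
        current = max(0.0, base_speed + boost * shortcuts[t] - drag * defects)
        speed.append(current)
        pressure.append(max(0.0, target_speed - current))
    return speed, pressure


speed, pressure = fixes_that_backfire()
print([round(s, 2) for s in speed])
print([round(p, 2) for p in pressure])
```

With these assumed numbers the speed first rises above the starting speed, which makes the fix look successful, and only later drops below where it began while the pressure ends up higher than at the start.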
Shifting the burden
The "Shifting the burden" system archetype consists of two balancing loops and one reinforcing loop. If the top-left balancing loop of "taking shortcuts to speed up" is a fix that backfires, is there any alternative balancing loop to help reach the goal? The balancing loop on the bottom-left provides an alternative. When we feel more pressure to speed up, we increase the "intensity of improvement" to improve the speed. Then, the pressure is relieved. There are many possible improvement actions, for example, training and learning to improve skills, strengthening collaboration, removing impediments, etc. These actions usually take longer to become effective than taking shortcuts. However, when shortcuts seem to be working in the short term, they lower the "motivation of improvement", leading to a decrease in the "intensity of improvement" and slower "speed". We already know that slower "speed" leads to bigger "pressure" and more shortcuts. This forms a reinforcing loop, which can be named "getting addicted to taking shortcuts". The key feature of the "Shifting the burden" archetype is the tension between the short-term symptomatic solution and the long-term fundamental solution. Because of the reinforcing loop, we get addicted to the short-term symptomatic solution.
Eroding goals
The "Eroding goals" system archetype consists of two balancing loops. We can define any problem as a gap between a goal and the current reality. The gap could be reduced by changing the reality or by changing the goal. This may sound like cheating, but it is not so uncommon, especially when it takes a long time to actually change the reality. The problem in the above diagram is the progress gap between "target progress" and "actual progress". In order to reduce the "progress gap", there are two solutions, i.e. two balancing loops. The upper loop is the solution of lowering the goal: the bigger the "progress gap", the lower the "target progress", the smaller the "progress gap". In fact, this is simply schedule slippage. The lower loop is the solution of changing the current reality: the bigger the "progress gap", the higher the "intensity of 'speed up' actions", the faster the "actual progress", the smaller the "progress gap". The "speed up" action here does not necessarily mean overtime. Although overtime may help to speed things up to some extent, too much of it will make the progress worse. There are many other "speed up" actions, such as improving efficiency, reducing scope, etc., but these actions usually take longer to become effective. It is this delay that makes us shift gradually to the upper loop. Schedule slippage becomes a habit. In a way, "Eroding goals" is a special case of "Shifting the burden". In the "Shifting the burden" archetype, the two balancing loops both change the current reality, while in the "Eroding goals" archetype, one of them changes the goal.
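The two competing balancing loops can be sketched with toy numbers. The coefficients `goal_erosion` and `speed_up` are assumptions for illustration; the slower effect of the "speed up" actions is modeled simply by giving them the smaller coefficient.

```python
def eroding_goals(target=100.0, actual=60.0, goal_erosion=0.25,
                  speed_up=0.125, periods=20):
    """Close the progress gap from both sides: lower the goal, raise reality."""
    for _ in range(periods):
        progress_gap = target - actual
        target -= goal_erosion * progress_gap  # upper loop: lower the goal
        actual += speed_up * progress_gap      # lower loop: change reality (slower)
    return target, actual


final_target, final_actual = eroding_goals()
print(round(final_target, 1), round(final_actual, 1))
```

Under these assumptions the gap does close, but mostly because the target has slipped: the two values meet much closer to the original "actual progress" than to the original "target progress".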
Limits to growth
The "Limits to growth" system archetype consists of one reinforcing loop and one balancing loop. The example for this archetype builds on the previous "value drives product growth" reinforcing loop. Potential customers are people who are targeted by our product but have not yet become actual customers. Therefore, "potential customers" is the difference between "target customers" and "actual customers". Higher "sales" lead to more "actual customers", which lowers the number of "potential customers", leading to smaller "sales". In other words, sales are limited by the number of target customers. This is usually described as market saturation.
Please note that both "Fixes that backfire" and "Limits to growth" archetypes consist of one reinforcing loop and one balancing loop, but their dynamics are completely different. In the "Fixes that backfire" archetype, the balancing loop is the solution, but it brings an unintended consequence, while the reinforcing loop works as a vicious cycle. In the "Limits to growth" archetype, the reinforcing loop works as a virtuous cycle, while the balancing loop limits it. Therefore, we would like to break the balancing loop so that the reinforcing loop would continue to bring benefits.
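Market saturation in the "Limits to growth" example can be illustrated with a toy simulation. The numbers and the `growth` factor are assumptions: the reinforcing loop is compressed into "each period's sales raise next period's demand by 50%", and the balancing loop caps sales at the remaining potential customers.

```python
def limits_to_growth(target_customers=1000.0, first_sales=50.0,
                     growth=1.5, periods=8):
    """Growing sales that run into a fixed pool of target customers."""
    actual_customers = 0.0
    demand = first_sales
    sales_history = []
    for _ in range(periods):
        potential_customers = target_customers - actual_customers
        sales = min(demand, potential_customers)  # balancing loop: the limit
        actual_customers += sales
        demand = growth * sales                   # reinforcing loop: growth
        sales_history.append(sales)
    return sales_history, actual_customers


sales_history, actual_customers = limits_to_growth()
print(sales_history)
# [50.0, 75.0, 112.5, 168.75, 253.125, 340.625, 0.0, 0.0]
print(actual_customers)  # 1000.0
```

Sales grow period after period until the pool of potential customers runs out, and then the growth engine stalls. Raising `target_customers` is the code equivalent of weakening the limit so that the reinforcing loop can keep working.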
4. Modeling Approach
In my Systems Thinking writing, I have included various system models. We shall learn two main approaches to system modeling, applied to organizational design and organizational change respectively. Organizational design is in fact redesign, as the current design exists regardless of whether it was designed consciously or not. We shall use four questions to drive the modeling analysis of the current design and potential alternatives, to gain deeper understanding and insight into the different choices for change. In organizational change, it is critical to effectively strengthen the driving forces and weaken the restraining forces. We shall model the driving and restraining factors on the basis of the "Limits to growth" system archetype to gain insights.
Organizational design
We use the following four questions to model various factors behind different choices:
- What is the intention of the current choice?
- What is the consequence of the current choice?
- Are there any alternatives that would achieve the same intention?
- Why is it hard to implement those alternatives?
Take the example of "development taking shortcuts" below:
- What is the intention of taking shortcuts? What purpose does it try to achieve? What problem does it try to solve? We can model achieving a goal or solving a problem as a balancing loop. In this case, it is the B1-loop of "taking shortcuts to speed up".
- What is the consequence of taking shortcuts? It is important to expand the time horizon to see if there are any unintended consequences. Taking shortcuts actually has a negative impact on the goal of speeding up in the longer term, which forms the R1-loop of "being slowed down by defects". The B1 and R1 loops together form the "Fixes that backfire" system archetype. Taking shortcuts may also influence other goals, for example quality. Supposing that taking shortcuts does make you faster but decreases your quality, it becomes a question of which system goal - speed or quality - is more important to you.
- Are there any alternatives to "speed up"? If "speed up" is the intention behind taking shortcuts, can we take another approach to realize this intention? The B2-loop of "improving efficiency to speed up" is one such approach.
- Why is it hard to implement the B2-loop? Why do we keep taking shortcuts? The R2-loop of "getting addicted to taking shortcuts" provides an explanation.
Through these four questions, we have done a sufficiently comprehensive analysis of the dynamics behind taking shortcuts, paving the way for further decision making.
Organizational change
We use two other questions to model organizational change.
- What driving forces can we leverage for a certain change?
- What restraining forces do we need to weaken for a certain change?
This is similar to the common force-field analysis. The difference is that we shall leverage potential reinforcing loops and identify the limits associated with balancing loops, while force-field analysis often stops at linear causation.
Take the example of adopting some practice:
- What would drive the adoption of a practice? Let the teams who have adopted it share their experience and communicate with other teams, so as to increase the enthusiasm of those teams. Thus, more teams may start to adopt it. There is a reinforcing R1-loop here, which can be named "sharing to inspire enthusiasm".
- What could restrain the adoption? A team needs coaching while adopting a new practice, but coaching resources may be limited. Thus, we are unable to provide effective coaching for many teams at the same time. There is a balancing B1-loop here, which can be named "being limited by coaching resource".
The R1 and B1 loops form the "Limits to growth" system archetype. Here, we have only identified one reinforcing loop and one balancing loop. In reality there would be more reinforcing loops to drive the change, as well as more balancing loops to restrain the change.
Conclusion
I hope that this primer helps you learn and practice systems thinking. Systems thinking is a specialized subject, and there are many books available for reference and study. In this primer, I have used examples from product development to describe some basic concepts that we can apply in many similar situations.
A system, by definition, involves multiple factors interacting with each other. One challenge in applying systems thinking is where to start. It doesn't need to be complicated. Start with the problem, identify some factors (variables) and relationships (links) between them. Look for loops. Loops are key. Multiple loops form various system archetypes. Here I shared two approaches that could be applied for organizational design and organizational change respectively.
P.S. Many thanks to my colleagues - Ivan and Steven - for their great improvement suggestions!