Thinking fast and slow about ITTs
A colleague responded to a blog post I did the other day about Daniel Kahneman’s excellent psychology book “Thinking fast and slow”, and we got to talking about a very practical application of Kahneman’s theories in the way we make big purchasing decisions in our company. But this approach to decision-making actually applies to lots of other areas in work and in life: my son choosing a university, which product to buy on Amazon, where to go on holiday.
ITTs
Have you ever been involved in an Invitation To Tender (ITT) process? It could have been responding to an ITT (where your customer is considering your company as their supplier) or issuing an ITT (where you are doing a big purchase from another company). Either way, I wonder what your feelings were about the process?
I’ve responded to some pretty big ITTs in my time as a solutions architect, and my experience is that the process was cumbersome and it was rare that we could say we had done a really good job of responding. My brain could barely cope with the hundreds or thousands of questions asked. There was never enough time to do the job properly.
I’ve also been involved in evaluating ITT responses. Each one usually runs to hundreds of pages, and I was expected to digest all of these, scoring them on hundreds of points – sometimes for every question. It was daunting, caused major brain overload, and I wonder if good decisions came out of it.
To make it easier, the questions were sometimes presented in a spreadsheet with a “full compliance, partial compliance or non-compliance” score claimed by the supplier against each. Of course when responding, the goal was to interpret each question so that you appeared as compliant as possible. The result was that every supplier claimed compliance with almost everything. With so much information it became hard to see the differences, and to see the reality behind the apparently compliant responses. How do you judge between them?
My colleague emailed me his reflections on a particular ITT process he went through, and how we could improve things by “Thinking fast and slow” about ITT responses. It’s very pertinent to me at the moment, because we’re just at the beginning of a lengthy ITT process.
Like me, he is a prolific blogger in BT, usually with somewhat more of a technology focus than mine. With his permission, I’ve pasted some of what he said below (obscuring the supplier names and adding some pictures) – a semi-guest post. I think you’ll find it interesting. He says:
“The Current Problem
Making decisions about the outcome of an ITT is a vital process. Experience, at least in the IT space, is that the process is a cumbersome one, and fails to use the best evidence and practice about how people actually make decisions.
If we take one recent ITT – a well-run and orderly affair, with a good outcome – we can see a number of aspects:
- The scoring mechanism has taken many working weeks of effort to complete, with each number revised and re-revised, weightings modified, and overall totals tallied;
- It is likely that only two individuals have fully reviewed all the numbers in the scores;
- The scoring has in fact been secondary to our judgements on the overall effect of the proposals to our ability to deliver improved service over the long term at reduced cost;
- Scoring has not been able to properly reflect identified showstoppers;
- Scoring has not highlighted the real market and technical position of some of the suppliers.
Let’s look at Supplier A, for example. They offered us something that sits very well against what we asked for in terms of being a software-based, hardware-agnostic platform. However it doesn’t work for us: it can’t provision what we need to many of the type of servers we run – a non-starter.
In contrast, Supplier B offered us a device that is nothing like what we asked for, but it is highly competent and could in practice be deployed – even though the device was still in development. Another couple of suppliers did well because of their capabilities in one technical area, yet the ITT never actually asked for that.
It’s clear that we make decisions not using the scoring process itself but through other mechanisms and use the scoring process only as a cumbersome vehicle to express those decisions. We need to look at how we really do work.
How People Make Decisions
I’m going to lean very heavily on Professor Daniel Kahneman’s work. Kahneman is a Nobel laureate whose lifetime has been spent in studying how people actually make decisions and the mechanisms that can be used to help do this. “Thinking, Fast and Slow” (ISBN-10: 0141033576) contains the material I’ll be using.
There are many ideas in Kahneman’s work, with many aspects of thought covered in the text. I will argue here that two of them are particularly important. The first is that thought and decision-making involve two different systems (his System 1 and System 2 – the fast and slow thinking of the title), both of which have validity and both of which need to be used in good decision-making. Sometimes it’s impossible to use both, but knowing that they both exist, and something about how to make them work, is a start.
Fundamentally, people have two systems by which they think. System 1 works very rapidly and is unaware of many aspects of a decision – it is biased by what is front of mind (the “availability” heuristic), does not understand statistics, and can be tricked by expectation management and “anchoring” – but it is very good at combining a complex set of factors and getting to a decision.
System 2 in contrast takes its time. It is able to weigh factors and (if it is aware of these aspects of thought) can make allowances for tricks, can apply statistical analysis, can go back over information and remember past experience and new factors; but is much less decisive than System 1.
I propose that we use System 1 and System 2 by marking ITT responses twice: once on first reading, and once after proper analysis. However the way we do the marking is critical, and depends on the second aspect of Kahneman’s work.
He shows, building on others’ work, that complex approaches to classifying an important decision simply don’t work as effectively as people think they do, and that there is another way to do things.
There is a great deal of content in the book and I would fail at trying to summarise it. But it boils down to this: trying to score every result for every question and apply a weighting to it is ineffective as a mechanism for decision-making.
In contrast, he puts forward an approach that has been widely used and proven. A good example here is the Apgar test, used to assess the health of a new-born baby. Dr Virginia Apgar realised that there was no single simple way to judge whether a new-born was healthy or not. Some doctors didn’t really look at the baby at all, or looked at only one or two factors, while others applied complex tests that took time and turned a simple, urgent decision into a complex and untimely process.
To counter this inconsistent and poor decision-making, Apgar proposed that each child be given a score on just five factors, which were latterly given the titles “Appearance, Pulse, Grimace, Activity, and Respiration” (at once making her name immortal and her existence as a person largely forgotten). This mechanism introduced a simple, consistent, timely and complete way to judge whether a new-born needed urgent intervention, needed observation, or was essentially healthy.
The Apgar test is used in an urgent scenario, but Kahneman shows very similar characteristics in other, much longer-running decision-making processes, such as investment fund management. Adding complexity slows us down and makes our decision-making behaviour more obscure.
This is all very interesting but what does it mean for an ITT process? We are certainly not going to boil down an ITT to five or eight requirements. We still need to get a full set of requirements defined. However my proposal is that we mark the responses based on a summary set of what is actually important to us – with the responses from an ITT being used (twice!) to give us the marks.
Many of these overall considerations will be consistent across ITTs:
- Can we Deploy it? [D]
- Is the solution Robust? [R]
- Can we Operate it? [O]
- Do we believe we can work with the suPplier? [P]
- Is the proposal Complete? [C]
- Will it Improve customer service? [I]
- Will it reduce our Costs? [C]
- Is the proposal a good Strategic fit? [S]
This is then the DROPCICS process. The dropkick is taken, judged, and then reviewed again in slow motion.
These factors are only my initial take and we can of course refine them, but my proposal is that we give each response a mark out of ten for each of them. We then need to consider each of the numbers in turn, not just the total: any solution that scores less than 5 on any factor is almost certainly unacceptable, except perhaps as a niche player. The totals may then be used to compare.
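As a hypothetical sketch (supplier names, factor labels and marks are all invented purely for illustration), the scoring rule might look like this:

```python
# Hypothetical sketch of the proposed scoring rule: each response gets a mark
# out of ten per factor; any mark below 5 is a showstopper; totals compare.
FACTORS = ["Deploy", "Robust", "Operate", "suPplier",
           "Complete", "Improve", "Cost", "Strategic"]

def assess(scores):
    """Return (total, showstoppers) for one supplier's marks out of 10."""
    showstoppers = [f for f in FACTORS if scores[f] < 5]
    return sum(scores.values()), showstoppers

# Invented marks echoing the earlier examples: A is undeployable and hard to
# operate; B is incomplete and a poor strategic fit.
supplier_a = dict(zip(FACTORS, [2, 8, 3, 7, 8, 7, 8, 9]))
supplier_b = dict(zip(FACTORS, [7, 8, 7, 8, 4, 7, 6, 3]))

for name, marks in [("Supplier A", supplier_a), ("Supplier B", supplier_b)]:
    total, flags = assess(marks)
    print(f"{name}: {total}/80, showstoppers: {flags or 'none'}")
```

Both suppliers end up with similar totals, but the per-factor showstoppers tell the real story – which is exactly why the numbers need to be read individually, not just summed.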
In our first review – the System 1 review – we simply read the response and assign the marks at the end. Those numbers are then fixed.
In our second, System 2, review we work through and attempt to formulate a response based on a more complete review. These numbers can be changed through the process. We would use the System 1 results as a focus for specific issues; the less-than-5 scores need to be validated as correct.
At the end of the process we primarily use the System 2 results, as this will have all the information learned throughout the process. However we will then go back to our System 1 results and look to identify the discrepancies. It may be that the differences are easily understood, but we need to be aware that our System 1 is proven to be a highly capable complex decision-making engine.
If we look at the recent ITT on this basis, many of the factors that were problems would come through very clearly. One of the solutions was undeployable (and inoperable, too, in fact). Another solution was incomplete and not a good strategic fit. So System 1 would have highlighted this quickly, given us a focus to check we’re right and let us move on to other questions. System 2 would then get us to a simple set of numbers based on real due diligence and judge the outcome openly without hiding the process in a thousand weighted numbers.
Summary Process
- Identify the small number of factors we are going to mark upon.
- Read each response and give a System 1 mark out of 10 for each of the factors.
- Work through the process of understanding the detail, with clarifications, corrections and full detailed analysis, illuminated by System 1 marking, and come out with a System 2 mark.
- Identify the reasons for discrepancies between the two sets of marks and agree a final mark for each, primarily based on System 2 but with System 1 factored in as appropriate.
- Agree whether there is a reason why a response with a single low score should be left in the process.
- Use the marks to agree a final outcome.”
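To make the reconciliation step concrete, here is a hypothetical sketch of my own (not my colleague’s – the factor names, marks and the discrepancy threshold are invented) of how the System 1 and System 2 marks might be compared:

```python
# Hypothetical sketch: flag factors where the fast (System 1) and slow
# (System 2) marks diverge enough to be worth explaining.
def discrepancies(system1, system2, threshold=2):
    """Return factors whose two marks differ by more than `threshold`."""
    return {f: (system1[f], system2[f])
            for f in system1
            if abs(system1[f] - system2[f]) > threshold}

# Invented marks for one supplier across three of the factors.
s1_marks = {"Deploy": 3, "Robust": 8, "Operate": 7}  # first-read impressions
s2_marks = {"Deploy": 7, "Robust": 8, "Operate": 6}  # after detailed analysis

print(discrepancies(s1_marks, s2_marks))  # {'Deploy': (3, 7)}
```

Here the Deploy mark jumped from 3 to 7 between passes: before trusting the System 2 number, it is worth asking what the first reading saw that the detailed analysis explained away – or missed.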
So what do you think of the DROPCICS proposal? Is it workable? Have you ever done decision-making that way? Have you any other suggestions to improve how we deal with ITTs?