The 'what would happen if I didn't' stress test

[Image: A founder at a kitchen table reading a chat window on her laptop, a printed weekly calendar with a circled entry beside her]
TL;DR

The 'what would happen if I didn't' stress test is an inversion-style elimination prompt: name a recurring activity, name a four-week pause, ask AI to enumerate plausible failure modes with severity. AI is better at this than founders thinking alone because it forces specifics, does not get bored, and does not catastrophise. It surfaces three patterns: real failures worth a downgrade, no failures worth a kill, and discomfort signals that are relationship questions.

Key takeaways

- The stress test is a counterfactual elimination prompt: name a recurring activity, name a four-week pause, ask AI to enumerate plausible failure modes with severity. It is the inverse of Gary Klein's pre-mortem from the September 2007 Harvard Business Review.
- AI runs this test better than a founder thinking alone because it forces task-by-task specifics, does not tire after the third item, and does not loop back to 'but what if' the way a worried operator's head does.
- The first pattern the test surfaces is recoverable failure: a real downside exists but you can downgrade the activity rather than kill it. The second is no failure mode at all, which is a kill.
- The third pattern is the one most stress-test content collapses. The only failure mode is 'someone gets uncomfortable', which is a relationship signal, not a business one. Treat it as a conversation to have, not an activity to keep.
- The follow-on test is empirical. Pause the activity for two weeks. Watch what actually happens. Calibrate the model's enumeration against reality, then make the keep-or-kill call with information you did not have before.

She has been writing the same quarterly report for three years. Twelve pages, four colour charts, a one-page executive summary on top. It goes out to nine people, six of whom have never replied to it. The other three reply with one-line acknowledgements. She has assumed for those three years that someone, somewhere, must be reading the rest of it carefully. She has never actually tested the assumption.

That gap is what this post is about. The most expensive question for any recurring activity in your week is also the simplest, and you can run it with a chat window open in front of you in about fifteen minutes.

What is the stress test, in plain terms?

The stress test is a counterfactual elimination prompt. You name a recurring activity in your week, name a four-week pause on it, and ask AI to enumerate every plausible failure mode with severity, who notices, and on what timeline. The prompt forces specifics: not ‘this might cause problems’, but ‘on day eight, the ops director notices the missing dashboard’. The activity has nameable failure modes, or it does not.

The lineage runs through Gary Klein’s pre-mortem method from the September 2007 Harvard Business Review, run in reverse. Klein’s pre-mortem imagines a future project has failed and asks why; the stress test imagines a present-day activity has stopped and asks what breaks. Same intellectual move, opposite direction. Charlie Munger’s ‘always invert’ rule, popularised in the Farnam Street write-up, is the methodological grandfather of both.

Why is the AI version more useful than thinking it through alone?

Three things happen in your head that do not happen in a chat window. You skip the specifics, thinking ‘someone might complain’ instead of naming who and when. You get bored after three items, where the model produces fifteen and ranks them. And your worried head loops back to ‘but what if’, the catastrophising reflex Kahneman, Sibony and Sunstein document at length in Noise. The model enumerates and stops.

The Annie Duke version of this argument, in Quit, is that we are systematically bad at the keep-or-kill decision because sunk cost and status quo bias both push us in the same direction. The stress test is the structural counterweight. It forces the question, then forces the specifics. You can run it on yourself with a notebook, but the discipline is hard to hold for an hour, and the model holds it for as long as you keep typing. That is the difference between a thought experiment and a usable one.

Where will you actually meet it on your desk?

The natural targets are recurring activities you have inherited or instituted and never re-tested. Weekly internal meetings, monthly all-hands, quarterly reports, internal newsletters, status updates to investors who stopped reading them, dashboards that no longer drive decisions, standing one-to-ones with people whose role has changed. Anything on your calendar that repeats, where you cannot in plain English say what specifically breaks if you paused for a month.

The test surfaces three patterns. The first is real and recoverable: the activity has a genuine failure mode, but you can downgrade rather than kill. The weekly forty-minute meeting becomes a fifteen-minute async update. The full report becomes a one-page summary. The second is no failure mode at all: when you ask the model to name what specifically breaks, it cannot, and on reflection neither can you. That is a kill. The third is the pattern that much of the stress-test literature collapses into the first two, and it is worth spotting precisely.

A worked example helps. The quarterly report from the opening anchor: nine recipients, twelve pages, four colour charts. Run the prompt and the model returns three findings. Two of the named recipients use one specific table on page four for budget meetings. None of the others would notice the report disappearing. The right move is not to kill the report or to keep it, but to send the one table the two recipients use, and stop producing the other eleven pages. Forty minutes a quarter, recovered, with the actual users better served.

When is the answer ‘someone gets uncomfortable’?

The third pattern is the one where the only plausible failure mode the model returns is that a specific person would feel slighted, unseen, or out of the loop. That is a relationship signal in operational clothing, and treating it as a logistics question is how you damage a relationship by accident.

The standing one-to-one with a long-tenured colleague is the canonical example. Operationally, four weeks pausing it would change nothing measurable. Relationally, the message of pausing it is ‘I no longer think this conversation is worth my time’, which may be the opposite of what you mean. The right move there is to have the conversation directly with the person about what the meeting is actually for. Either you arrive at a fresh shape that earns its place in the calendar, or you both agree it has served its purpose. The stress test surfaced the question; the answer was always going to be relational.

The same applies to quarterly reports going to a key client, monthly check-ins with a co-founder, and any standing communication that exists more to signal care than to convey information. AI can flag that the only failure mode is discomfort. AI cannot weigh whether that discomfort is load-bearing or vestigial. That weighting is your job.

What should you do after the model has given you its list?

Calibrate empirically. The stress test gives you a hypothesis, not a verdict. Pause the activity for two weeks, not four, and watch what actually happens. The list the model produced is your reference: did the day-eight failure it predicted actually happen? Did anyone ask? Did any decision quality drop? The gap between predicted and observed failure modes is where the real elimination decisions get made.
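That calibration step is just three set comparisons. A minimal sketch, with toy data and variable names that are illustrative rather than from the post:

```python
# Failure modes the model predicted for a two-week pause of the
# quarterly report, versus what was actually observed.
predicted = {
    "ops director notices missing dashboard by day 8",
    "budget meeting stalls without the page-four table",
    "investor asks where the report went",
}
observed = {
    "budget meeting stalls without the page-four table",
}

confirmed = predicted & observed     # the model was right about these
false_alarms = predicted - observed  # enumerated, but never happened
surprises = observed - predicted     # happened, but were never predicted

# The keep-or-kill call rests on what was confirmed: here only the
# page-four table is load-bearing, so send the table and drop the rest.
print(f"confirmed: {sorted(confirmed)}")
print(f"false alarms: {len(false_alarms)}, surprises: {len(surprises)}")
```

Anything in `surprises` means the model's enumeration missed something real, which is exactly the information you did not have before the pause.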

For the activities that pass the empirical test, kill them and reclaim the time. For the activities where a real but downgradeable failure mode showed up, run the downgrade and watch again. For the activities where the only signal was relational, schedule the conversation you have been avoiding rather than letting the calendar carry it indefinitely. None of this is dramatic on its own. Stack the answers across a quarter and the calendar starts to look very different. That is the Eliminate quadrant of EAD-Do doing its job, with the model holding the discipline you cannot reliably hold alone.

If you want to run the stress test against your own week, with a second pair of eyes on which results to act on first, book a conversation.

Sources

- Klein, Gary (2007). "Performing a Project Premortem", Harvard Business Review, September 2007. The canonical reference for the pre-mortem method, which the stress test inverts. https://hbr.org/2007/09/performing-a-project-premortem
- Klein, Gary. Personal site, premortem page. Klein's own articulation of the technique with method notes. https://www.gary-klein.com/premortem
- Farnam Street (Shane Parrish). "Inversion: The Crucial Thinking Skill Nobody Ever Taught You". Cited as the methodological grandfather (Charlie Munger's 'always invert' applied to recurring activity). https://fs.blog/inversion/
- Duke, Annie (2022). Quit: The Power of Knowing When to Walk Away. Portfolio/Penguin. Cited as the most directly applicable single source on knowing when to stop, the decision the stress test surfaces. https://www.porchlightbooks.com/products/quit-annie-duke
- Duke, Annie (2018). Thinking in Bets. Portfolio/Penguin. Cited for the decision-quality framing under which counterfactual tests are run. https://www.porchlightbooks.com/products/thinking-in-bets-annie-duke
- Kahneman, Daniel; Sibony, Olivier; Sunstein, Cass (2021). Noise: A Flaw in Human Judgment. William Collins. Cited as the cognitive-bias backdrop on why founders cannot reliably run this test in their heads. https://us.macmillan.com/books/9780316451406/noise/
- Psychology Today (2018). "Counterfactual Thinking" explainer. The cognitive science backdrop on counterfactual reasoning, which the stress test operationalises. https://www.psychologytoday.com/us/blog/the-science-behind-behavior/201803/counterfactual-thinking
- Psychology Today. "Sunk Cost Fallacy" reference. Cited as one of the two biases the stress test is designed to neutralise. https://www.psychologytoday.com/us/basics/sunk-cost-fallacy
- Psychology Today. "Status Quo Bias" reference. Cited as the second bias the stress test is designed to neutralise. https://www.psychologytoday.com/us/basics/status-quo-bias
- Gregersen, Hal (2018). "How to ask great questions", MIT Sloan Ideas Made to Matter. Cited for the methodology of asking the right counterfactual question, which is the prompt skill the stress test depends on. https://mitsloan.mit.edu/ideas-made-to-matter/how-ask-great-questions

Frequently asked questions

How is this different from a Gary Klein pre-mortem?

A pre-mortem imagines the project has failed and asks why. The stress test imagines the activity has stopped and asks what breaks. Klein's 2007 HBR piece runs the test forward against a future failure; this runs it backwards against present-day work that may have outlived its usefulness. Same intellectual lineage, opposite direction. The pre-mortem belongs in the Do quadrant of EAD-Do (planning a new initiative). The stress test belongs in Eliminate.

What prompt actually works for this?

Keep it concrete. 'I run a weekly internal team update on Mondays, 30 minutes, six people. If I paused this for four weeks, list every plausible failure mode by severity. Name what specifically breaks, who notices, and on what timeline.' AI returns ten to fifteen items, ranked. You then triage: real and recoverable, real and serious, no failure mode, or 'someone gets uncomfortable'. The prompt's job is to force the model into specifics, not to ask it for advice.
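If you run the test across many calendar entries, it helps to template the prompt so every activity gets the same forcing structure. A sketch of that, in Python; the function and field names are my own, not from the post:

```python
from dataclasses import dataclass

@dataclass
class Activity:
    name: str          # e.g. "weekly internal team update"
    cadence: str       # e.g. "Mondays, 30 minutes, six people"
    pause_weeks: int = 4

def stress_test_prompt(a: Activity) -> str:
    """Build the counterfactual elimination prompt described above.

    The job of the prompt is to force specifics - what breaks, who
    notices, on what timeline - not to ask the model for advice.
    """
    return (
        f"I run a {a.name} ({a.cadence}). "
        f"If I paused this for {a.pause_weeks} weeks, list every plausible "
        "failure mode, ranked by severity. For each one, name what "
        "specifically breaks, who notices, and on what timeline."
    )

# The four triage buckets for the model's returned list.
TRIAGE = (
    "real and recoverable",        # downgrade the activity
    "real and serious",            # keep it
    "no failure mode",             # kill it
    "someone gets uncomfortable",  # relationship signal: have the conversation
)

print(stress_test_prompt(Activity("weekly internal team update",
                                  "Mondays, 30 minutes, six people")))
```

Paste the generated text into whichever chat window you use; the template just keeps you from drifting back into vague 'this might cause problems' phrasing on the tenth activity.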

When does this test give the wrong answer?

When the activity's value is relational rather than operational, and you let the model collapse the relationship into a logistics question. Standing meetings with a co-founder, a one-to-one with a long-tenured employee, a quarterly call with a key customer, all of these can pass the four-week-pause test on operational grounds and still be load-bearing on relationship grounds. AI can flag the discomfort signal but cannot weigh it for you. That weighting is your job.

This post is general information and education only, not legal, regulatory, financial, or other professional advice. Regulations evolve, fee benchmarks shift, and every situation is different, so please take qualified professional advice before acting on anything you read here. See the Terms of Use for the full position.

Ready to talk it through?

Book a free 30 minute conversation. No pitch, no pressure, just a useful chat about where AI fits in your business.

Book a conversation
