She has been writing the same quarterly report for three years. Twelve pages, four colour charts, a one-page executive summary on top. It goes out to nine people, six of whom have never replied to it. The other three reply with one-line acknowledgements. She has assumed for those three years that someone, somewhere, must be reading the rest of it carefully. She has never actually tested the assumption.
That gap is what this post is about. The most expensive question for any recurring activity in your week is also the simplest, and you can run it with a chat window open in front of you in about fifteen minutes.
What is the stress test, in plain terms?
The stress test is a counterfactual elimination prompt. You name a recurring activity in your week, propose a four-week pause on it, and ask the AI to enumerate every plausible failure mode with its severity, who notices, and on what timeline. The prompt forces specifics: not ‘this might cause problems’, but ‘on day eight, the ops director notices the missing dashboard’. Either the activity has nameable failure modes, or it does not.
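If you prefer to run the test through an API rather than a chat window, the prompt can be templated. A minimal sketch in Python: the structure (the activity, the four-week pause, severity, who notices, and the timeline) follows the description above; the exact wording and the function name are illustrative, not a canonical prompt.

```python
# Template for the counterfactual elimination prompt described above.
# The required fields (what breaks, who notices, on what day, severity)
# follow the post; the phrasing is an illustrative assumption.

def build_stress_test_prompt(activity: str, pause_weeks: int = 4) -> str:
    """Build a stress-test prompt for one recurring activity."""
    return (
        f"I currently do this recurring activity: {activity}.\n"
        f"Assume I pause it completely for {pause_weeks} weeks.\n"
        "Enumerate every plausible failure mode. For each one give:\n"
        "1. what specifically breaks (a named artefact or decision),\n"
        "2. who notices, by name or role,\n"
        "3. on what day of the pause they notice,\n"
        "4. severity: cosmetic, recoverable, or serious.\n"
        "Do not say 'this might cause problems'. If you cannot name "
        "a specific failure, say so explicitly."
    )

prompt = build_stress_test_prompt("the twelve-page quarterly report")
```

Paste the result into whichever model you use; the point of the template is only that the specifics are demanded up front, so a vague answer is visibly a non-answer.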
The lineage runs through Gary Klein’s pre-mortem method from the September 2007 Harvard Business Review, run in reverse. Klein’s pre-mortem imagines a future project has failed and asks why; the stress test imagines a present-day activity has stopped and asks what breaks. Same intellectual move, opposite direction. Charlie Munger’s ‘always invert’ rule, popularised in the Farnam Street write-up, is the methodological grandfather of both.
Why is the AI version more useful than thinking it through alone?
Three things happen in your head that do not happen in a chat window. You skip the specifics, thinking ‘someone might complain’ instead of naming who and when. You get bored after three items, where the model produces fifteen and ranks them. And your worried head loops back to ‘but what if’, the catastrophising reflex Kahneman, Sibony and Sunstein document at length in Noise. The model enumerates and stops.
The Annie Duke version of this argument, in Quit, is that we are systematically bad at the keep-or-kill decision because sunk cost and status quo bias both push us in the same direction. The stress test is the structural counterweight. It forces the question, then forces the specifics. You can run it on yourself with a notebook, but the discipline is hard to hold for an hour, and the model holds it for as long as you keep typing. That is the difference between a thought experiment and a usable one.
Where will you actually meet it on your desk?
The natural targets are recurring activities you have inherited or instituted and never re-tested. Weekly internal meetings, monthly all-hands, quarterly reports, internal newsletters, status updates to investors who stopped reading them, dashboards that no longer drive decisions, standing one-to-ones with people whose role has changed. Anything on your calendar that repeats, where you cannot in plain English say what specifically breaks if you paused for a month.
The test surfaces three patterns. The first is real and recoverable: the activity has a genuine failure mode, but you can downgrade rather than kill. The weekly forty-minute meeting becomes a fifteen-minute async update. The full report becomes a one-page summary. The second is no failure mode at all: when you ask the model to name what specifically breaks, it cannot, and on reflection neither can you. That is a kill. The third is the pattern much of the stress-test literature collapses into the first two, and it is worth spotting precisely.
A worked example helps. The quarterly report from the opening anchor: nine recipients, twelve pages, four colour charts. Run the prompt and the model returns three findings. Two of the named recipients use one specific table on page four for budget meetings. None of the others would notice the report disappearing. The right move is not to kill the report or to keep it, but to send the one table the two recipients use, and stop producing the other eleven pages. Forty minutes a quarter, recovered, with the actual users better served.
When is the answer ‘someone gets uncomfortable’?
The third pattern is the one where the only plausible failure mode the model returns is that a specific person would feel slighted, unseen, or out of the loop. That is a relationship signal in operational clothing, and treating it as a logistics question is how you damage a relationship by accident.
The standing one-to-one with a long-tenured colleague is the canonical example. Operationally, four weeks pausing it would change nothing measurable. Relationally, the message of pausing it is ‘I no longer think this conversation is worth my time’, which may be the opposite of what you mean. The right move there is to have the conversation directly with the person about what the meeting is actually for. Either you arrive at a fresh shape that earns its place in the calendar, or you both agree it has served its purpose. The stress test surfaced the question; the answer was always going to be relational.
The same applies to quarterly reports going to a key client, monthly check-ins with a co-founder, and any standing communication that exists more to signal care than to convey information. AI can flag that the only failure mode is discomfort. AI cannot weigh whether that discomfort is load-bearing or vestigial. That weighting is your job.
What should you do after the model has given you its list?
Calibrate empirically. The stress test gives you a hypothesis, not a verdict. Pause the activity for two weeks, not four, and watch what actually happens. The list the model produced is your reference: did the day-eight failure it predicted actually occur? Did anyone ask? Did any decision quality drop? The gap between predicted and observed failure modes is where the real elimination decisions get made.
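Keeping score during the pause can be as simple as a table of predicted versus observed failures. A sketch of that comparison in Python, with entirely invented data; the failure-mode names and the `score_pause` helper are illustrative, not part of any tool:

```python
# Compare the model's predicted failure modes against what was actually
# observed during the pause. All entries here are invented examples.

predicted = {
    "ops director misses dashboard": {"day": 8, "severity": "serious"},
    "finance asks for budget table": {"day": 11, "severity": "recoverable"},
    "team feels out of the loop": {"day": 14, "severity": "cosmetic"},
}

# The only failure that actually showed up during the two-week pause.
observed = {"finance asks for budget table"}

def score_pause(predicted: dict, observed: set) -> dict:
    """Split predictions into confirmed failures and phantoms."""
    confirmed = {k: v for k, v in predicted.items() if k in observed}
    phantom = sorted(k for k in predicted if k not in observed)
    return {"confirmed": confirmed, "phantom": phantom}

result = score_pause(predicted, observed)
# One confirmed, recoverable failure and two phantoms points at a
# downgrade (send the one table) rather than a straight keep or kill.
```

The phantoms are the interesting column: every predicted failure that never materialised is weight the activity was carrying in your head, not on your calendar.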
For the activities that pass the empirical test, kill them and reclaim the time. For the activities where a real but downgradeable failure mode showed up, run the downgrade and watch again. For the activities where the only signal was relational, schedule the conversation you have been avoiding rather than letting the calendar carry it indefinitely. None of this is dramatic on its own. Stack the answers across a quarter and the calendar starts to look very different. That is the Eliminate quadrant of EAD-Do doing its job, with the model holding the discipline you cannot reliably hold alone.
If you want to run the stress test against your own week, with a second pair of eyes on which results to act on first, book a conversation.