A long-standing client emailed an owner I work with last month and said her recent updates were “more polished than they used to be, but less like you.” The owner had been using AI to draft emails for four months. The team thought the writing was sharper. The client felt something had gone. Neither was wrong.
The pattern is common enough across the firms I work with that it deserves its own evaluation step. Many teams review AI-drafted writing for accuracy and obvious awkwardness, and stop there. The voice gets a glance, not a check. The result is a slow drift of brand voice toward the generic, fluent register language models default to. By the time anyone inside the firm notices, clients have already started to feel it.
What is a brand-voice evaluation pass?
A brand-voice evaluation pass is a deliberate, structured check applied to AI-drafted writing before it leaves the firm, separate from accuracy review and separate from a full line edit. It asks one question: does this sound like us? The pass runs three quick tests against a known baseline, takes under a minute per short piece, and either approves the draft, sends it back for a rewrite, or flags it for a heavier human pass.
Think of the pass as a screening step, the way a senior partner used to glance at a junior’s letter before it went out. The three tests cover vocabulary, sentence shape and stance, and together they catch the bulk of the drift. The pass assumes the content has already cleared an accuracy check, and that the firm has some baseline sense of how it sounds, even if that is only a paragraph pinned in a shared document.
Why does it matter for your business?
It matters because the drift is predictable, measurable and almost invisible from inside the firm. Language models are optimised to produce fluent, statistically probable text, not distinctive text. Stanford research has documented that AI systems flatten human voices toward short, common, broadly safe patterns, and that even speakers in casual settings have started using ChatGPT-favoured words like “meticulous”, “realm” and “adept” up to 51 percent more often than before the model’s public release.
The mechanism is straightforward. A language model picks the most probable next word given everything before it. Trained on internet-scale text, it gravitates to high-frequency vocabulary and to sentence structures that recur across millions of business documents. Researchers call the long-run version of this “model collapse”: synthetic outputs feed back into training, and each generation loses more of the distinctive tail of language.
The commercial cost shows up at the wrong end of the client relationship. Berkeley research on authenticity in the age of AI found audiences assess credibility by checking coherence across multiple signals. When the writing stops sounding like the person who used to send it, the client does not always articulate the change, but their sense of the firm shifts. The shift surfaces in renewal conversations, in slower responses, in a quiet reduction in referrals. The voice was the trust signal the firm leaned on without naming.
Where will you actually meet it?
You will meet it in the three tests the pass runs, applied to whatever AI has just drafted before it goes out. None of them require special tools. All three can be learned in half an hour and applied in under ninety seconds per piece. The shape of each test below is the version I have found works for owner-managed firms with no editorial function.
Test one, vocabulary. Read the draft and note any word that appears three or more times, and any phrase that sounds like generic business writing rather than your firm’s voice. Common AI tells include “clearly”, “importantly”, “however”, and elevated vocabulary the owner would not reach for. Check against a sample of the team’s pre-AI writing. If the firm habitually writes “we sized the system for thirty percent headroom” and the draft says “we designed it to accommodate future demand with appropriate margin for growth”, the vocabulary has drifted.
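None of this needs tooling, but if someone on the team is comfortable running a script, the repetition count is easy to mechanise. A minimal sketch in Python, assuming the draft is saved as plain text in a file called draft.txt; the tell list is illustrative only and should be rebuilt from your own baseline reading:

```python
import re
from collections import Counter

# Illustrative tells only; rebuild this list from your own baseline reading.
AI_TELLS = {"clearly", "importantly", "however", "meticulous", "realm", "adept"}

def vocabulary_check(draft: str, repeat_threshold: int = 3):
    """Flag repeated words and known AI-tell vocabulary in a draft."""
    words = re.findall(r"[a-z']+", draft.lower())
    counts = Counter(words)
    # Skip short function words; "the" repeating three times is not drift.
    repeats = {w: n for w, n in counts.items()
               if n >= repeat_threshold and len(w) > 4}
    tells = sorted(set(words) & AI_TELLS)
    return repeats, tells

with open("draft.txt") as f:
    repeats, tells = vocabulary_check(f.read())
print("Repeated three or more times:", repeats)
print("AI-tell vocabulary present:", tells)
```

The script surfaces candidates; the check against pre-AI writing stays a human read.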
Test two, sentence shape. Read the draft aloud and listen for rhythm. AI text at default settings shows what computational linguists call low burstiness: sentences of similar length built on similar structures. Average sentence length is the practical proxy. Human business writing typically averages 15 to 20 words per sentence; ChatGPT at default settings averages 25 to 30. If the draft reads as evenly metronomic, the rhythm has flattened.
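The read-aloud test needs no tooling either, but the average and its spread are easy to compute if you want a number to track over time. A sketch, again assuming a plain-text draft in draft.txt; the naive sentence split is good enough for a screening pass, nothing more:

```python
import re
import statistics

def sentence_shape(draft: str):
    """Average words per sentence plus spread, a rough burstiness proxy."""
    # Naive split on terminal punctuation; adequate for a screening pass.
    sentences = [s for s in re.split(r"[.!?]+\s+", draft.strip()) if s]
    lengths = [len(s.split()) for s in sentences]
    mean = statistics.mean(lengths)
    spread = statistics.stdev(lengths) if len(lengths) > 1 else 0.0
    return mean, spread

with open("draft.txt") as f:
    mean, spread = sentence_shape(f.read())
print(f"Average sentence length: {mean:.1f} words (spread {spread:.1f})")
# Rule of thumb from the test above: a mean well past 20 with a low
# spread is the flat, metronomic rhythm the read-aloud check hears.
```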
Test three, stance. Pick every sentence in the draft that makes a recommendation or takes a position. Ask whether the firm would actually phrase it that way, or whether the AI has softened it. An adviser whose usual style is “the data suggests you should consolidate the providers, here is why” might find the draft saying “some clients have found that consolidating providers offers cost efficiencies, though individual circumstances vary”. The second version distributes responsibility and hedges. Accumulated hedging changes how clients perceive the firm’s authority.
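Stance is the hardest of the three to automate, because softening is a judgment call, but a crude keyword pass can at least surface the sentences worth reading twice. A sketch with an illustrative hedge list; tune it against how your own firm actually writes, and keep the final call human:

```python
import re

# Illustrative hedges only; tune this list against your firm's own writing.
HEDGES = ["some clients have found", "individual circumstances vary",
          "it could be argued", "may wish to consider", "potentially"]

def stance_check(draft: str):
    """Surface sentences containing hedge phrases for a human read."""
    sentences = [s.strip() for s in re.split(r"[.!?]+\s+", draft) if s.strip()]
    return [(s, [h for h in HEDGES if h in s.lower()])
            for s in sentences if any(h in s.lower() for h in HEDGES)]

with open("draft.txt") as f:
    for sentence, hedges in stance_check(f.read()):
        print(f"SOFTENED? {sentence!r} -> {hedges}")
```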
When to ask vs when to ignore
A meaningful share of outbound writing benefits from the pass. A handful of categories should not leave the firm on an AI draft and a voice pass alone, however polished the prose reads. Knowing which is which in advance prevents the high-stakes failures where the editing burden turns out to be 30 to 50 percent of the draft rather than 5 percent.
Regulated communication is the first hard line. If the firm operates under FCA, ICO, GMC, professional indemnity or any equivalent regime, every AI draft that touches on regulatory matters needs heavy human editing, not a quick voice pass. Stanford’s RegLab work found hallucination rates of 69 to 88 percent on specific legal queries. The practical issue is more mundane: models misstate small facts, misread thresholds, and confuse line items. The voice pass cannot catch a wrong threshold, only a wrong tone.
Professional advice the client will act on belongs in the same category. An engineering proposal, a tax position, a clinical letter, a legal opinion. The voice pass can run last, after a senior practitioner has reviewed the content sentence by sentence.
Sensitive personnel and reputation communication is the third. When the writing addresses a mistake, a service failure, or anything that could damage trust if misread, default AI behaviour is wrong by design. Models hedge, distribute responsibility, and soften language in exactly the moments when a credible voice needs to be direct. Expect to rewrite half the draft. For everything else, the three-test pass is the right level of check.
Related concepts
The pass works best when it sits inside three supporting practices that compound over time: a baseline, a cadence, and a small restraint about what the firm chooses not to AI-draft. Each takes hours to set up and minutes a week to maintain. Together they turn the voice pass from a one-off scan into a quiet operating habit the team applies without thinking about it.
The baseline is the anchor. Pull twenty to thirty recent pieces of client-facing writing from your team, from before AI assistance became routine, and read them in one sitting. Note the dominant tone, the characteristic vocabulary, and the way recommendations get framed. Write a short paragraph capturing what you found. Pin it where the team can see it. The Acrolinx tone-of-voice guide gives a workable template if you want one.
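If you want numbers pinned next to that paragraph, the baseline lends itself to the same light scripting as the tests above. A sketch, assuming the pre-AI pieces are saved as plain-text files in one folder; the folder name and the thirty-word vocabulary cut are placeholders, not recommendations:

```python
import re
import statistics
from collections import Counter
from pathlib import Path

def baseline_profile(folder: str, top_n: int = 30):
    """Summarise a folder of pre-AI writing: sentence length and vocabulary."""
    lengths, vocab = [], Counter()
    for path in Path(folder).glob("*.txt"):
        text = path.read_text()
        lengths += [len(s.split()) for s in re.split(r"[.!?]+\s+", text)
                    if s.strip()]
        vocab.update(w for w in re.findall(r"[a-z']+", text.lower())
                     if len(w) > 4)
    return {
        "avg_sentence_length": round(statistics.mean(lengths), 1),
        "sentence_spread": round(statistics.stdev(lengths), 1),
        "characteristic_vocabulary": vocab.most_common(top_n),
    }

# Folder name is a placeholder for wherever the pre-AI pieces live.
print(baseline_profile("pre_ai_writing"))
```

The numbers sit alongside the written paragraph, not instead of it; the paragraph is what the team will actually remember.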
The cadence is tiered. Every client-facing piece gets the voice pass once before sending. Internal writing gets a weekly sampling of three to five pieces pulled at random. Every ninety days, run a drift audit on ten or twelve representative pieces and look for trends. If vocabulary is narrowing, sentence shapes flattening or recommendations softening, the next step is usually a baseline re-brief rather than a tooling change.
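The drift audit is where the baseline numbers earn their keep. A sketch of the quarterly comparison, assuming the same plain-text filing as above; the folder name and the 17-words-per-sentence figure are placeholders for whatever your own baseline run produced:

```python
import random
import re
import statistics
from pathlib import Path

def avg_sentence_length(text: str) -> float:
    sentences = [s for s in re.split(r"[.!?]+\s+", text.strip()) if s]
    return statistics.mean(len(s.split()) for s in sentences)

def drift_audit(folder: str, baseline_avg: float, sample_size: int = 10):
    """Sample recent pieces at random and compare against the baseline."""
    files = list(Path(folder).glob("*.txt"))
    sample = random.sample(files, min(sample_size, len(files)))
    current = statistics.mean(avg_sentence_length(p.read_text()) for p in sample)
    print(f"Baseline {baseline_avg:.1f} words per sentence, "
          f"sampled {current:.1f} ({current - baseline_avg:+.1f})")

# Both the folder name and the 17.0 figure are placeholders: use whatever
# your own baseline profile produced a quarter ago.
drift_audit("recent_writing", baseline_avg=17.0)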
The restraint is the harder one. There are categories of writing the firm should choose not to AI-draft, regardless of how the pass would score them. Founder communications to long-standing clients, board updates, condolence messages, anything where the relationship rests on the personal voice of the sender. Editing the voice back in is more work than writing the piece once.
The conversation that prompts this work is the one the owner heard from her client. More polished than they used to be, but less like you. If you want to talk through what a voice pass looks like inside your own firm, book a conversation.