The owner I am thinking of caught it in a phrase. Her marketing lead had been describing everything that came out of the team as “AI-assisted”, and the owner had let the term slide for a few months. Then she sat with two pieces side by side. One was a memo her senior consultant had written and run through a polishing tool. The other was a 1,200-word LinkedIn article a junior had generated from a single prompt and lightly edited. The team called both “AI-assisted”. The owner could see they were not the same thing at all, and that the second one needed a different kind of review the team had not been doing. She asked me whether the distinction was worth making a rule about.
It is. The two paths produce different output, carry different categories of risk, and demand different evaluation thresholds. Treating them as one category leads to predictable failures, and the fix is small enough that an owner can put it in place in a single team meeting.
What is the difference between AI-edited and AI-drafted writing?
AI-edited writing means a human wrote the first version and an AI tool polished or tightened it afterwards. AI-drafted writing means the AI produced the first complete version from a prompt and a human then accepted, edited or rewrote what it generated. The difference matters because in the first path the human owns the facts, and in the second the model does until verified.
In the edited path, the human has already committed to a factual claim by writing it down, and the AI is operating on existing prose. In the drafted path, every claim in the output came from the model’s pattern matching across its training data, and the human reading it has no way to know which claims are anchored in real sources and which are confident fabrications. Stanford HAI’s 2024 AI Index documents hallucination rates on frontier models that vary by task but remain meaningful at scale. Pangram Labs’ detection work confirms that the distinction between edited and drafted prose is not reliably caught by automated tools either. The label has to come from the workflow itself.
Why does the distinction matter for the evaluation threshold?
The two paths carry different risk profiles, so the human review that catches each one looks different. AI-edited content carries regression risk, where a polish accidentally weakens a claim, strips a qualifier or softens the author’s voice. AI-drafted content carries invention risk, where the model fabricates statistics, sources or quotations with the same fluency as it cites real ones. The thresholds need to be calibrated to each.
A sentence like “the supplier could meet the deadline if additional resources were available” gets tightened by an editing tool to “the supplier could meet the deadline”, and the conditional clause that changes the meaning is gone. The check is a side-by-side reading that takes 5 to 10 minutes per piece, focused on whether the specifics survived the edit. The check for AI-drafted content is heavier. A 1,500-word piece may contain 15 to 25 factual claims, each capable of being a hallucination. The Harvard Business Review’s editorial guidance on AI-generated content notes that verification often takes longer than writing the piece manually, because the human cannot trust their own knowledge and has to check every significant claim against an external source. NIST’s AI Risk Management Framework places the responsibility for that verification on the deployer of the system, and the ICO’s UK GDPR guidance does the same for any output that touches personal data.
Where do owners actually meet this distinction in daily work?
You meet it at the points where AI output crosses a boundary: into a client’s inbox, into a published article, into a proposal, into a forecast that shapes a hiring decision. The boundary is the same in both paths, but the question being asked at the boundary is different. For edited content the question is whether the edit preserved meaning. For drafted content the question is whether the model invented anything.
The conflation problem usually shows up the same way. A team uses ChatGPT to draft a piece and Grammarly to clean it up. Both are AI tools, so the output gets filed as “AI-assisted” and runs through a single review gate. That gate is almost always calibrated for the lighter task. The drafted material slips through with no verification of the claims it contains, and the edited material gets reviewed too aggressively, with the team spending time hunting for invented facts that the human author never created in the first place. The asymmetry is invisible until someone reads two pieces back to back and sees what the single label has been hiding.
When does each path need a heavy threshold and when does a light one work?
It depends on the stakes of the piece. High-stakes content is anything that could damage credibility, affect a client relationship, create regulatory exposure or speak on behalf of a named individual or the firm. Thought leadership, client proposals, public statements and regulatory communications all sit there, and AI-drafted material in this band should not enter publication without expert verification of the primary factual claims.
AI-edited material in the high-stakes band needs the regression check and a voice consistency review, because the factual foundation is the human author’s already.
Medium-stakes content includes client background documents, technical memos for internal review, training materials and proposal appendices. For these, AI-drafted material is acceptable if the subject-matter owner has confirmed the major factual claims are reasonable and current. AI-edited material gets the regression check on its own.
Routine content includes social captions where no specific factual claim is being made, internal scheduling notices and refreshed evergreen content. For these, AI-drafted material is acceptable if it is verified once for accuracy and then reused in standard form, with a quarterly spot check for outdated references.
The point of the bands is that the team knows which threshold applies before the piece is produced, not after it has gone out.
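For a team that tracks content in a script behind a shared spreadsheet, the band logic can be written down directly. What follows is a minimal sketch, not a prescribed implementation: the band and path names, the function, and the fallback behaviour are illustrative assumptions, while the review requirements themselves are the ones described above.

```python
# Minimal sketch of the review-threshold lookup described above.
# Band and path names are illustrative; the requirements mirror the text.
REVIEW_THRESHOLDS = {
    ("high", "AI-drafted"): "expert verification of the primary factual claims before publication",
    ("high", "AI-edited"): "regression check plus voice consistency review",
    ("medium", "AI-drafted"): "subject-matter owner confirms the major factual claims are reasonable and current",
    ("medium", "AI-edited"): "regression check on its own",
    ("routine", "AI-drafted"): "verify once for accuracy, reuse in standard form, quarterly spot check for outdated references",
}

def required_review(stakes: str, path: str) -> str:
    """Look up the review threshold before the piece is produced, not after."""
    # Fallback for combinations the bands above do not name (an assumption,
    # not stated in the article): default to the heaviest check.
    return REVIEW_THRESHOLDS.get(
        (stakes, path),
        "expert verification of the primary factual claims before publication",
    )

# Example: an AI-drafted client proposal sits in the high-stakes band.
print(required_review("high", "AI-drafted"))
```

The fail-heavy default is a design choice rather than something the bands specify; a team could equally route unlisted combinations to a human decision.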
What is the three-question check that surfaces which path is in use?
Three questions, asked of any piece before it enters the approval gate, separate the two paths reliably. Did a human write this before AI touched it? If AI wrote the first version, what was the human’s role afterwards: minor edits, major rewrites, or accepting most of it as-is? What are the factual claims in this piece, and does the person approving it have direct knowledge of the source material?
The answers to those three questions tell a marketing lead which label to attach and tell the owner reading the label which review depth to expect. The team rule that holds it all together is simple. Drop “AI-assisted” as a category. Use AI-edited when the human is the primary author and AI was used for refinement. Use AI-drafted when AI produced the first substantial version. Attach the label to each piece at the point of approval. The labels make the right evaluation threshold visible at the moment it matters.
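For teams that log pieces at the approval gate, the rule compresses to a few lines of logic. This sketch is illustrative: the function and argument names are assumptions, and only the two labels and the three questions come from the rule above.

```python
def label_piece(human_wrote_first_version: bool) -> str:
    """Question 1 decides the label attached at the point of approval."""
    # A human first draft with AI refinement is AI-edited; an AI first
    # version, however heavily reworked afterwards, is AI-drafted.
    return "AI-edited" if human_wrote_first_version else "AI-drafted"

# Questions 2 and 3 (the human's role afterwards, and whether the approver
# has direct knowledge of the source material) do not change the label;
# they tell the reviewer how deep the check needs to go.
print(label_piece(False))  # -> "AI-drafted"
```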
If you want to talk through how to embed that discipline in your own team’s operating rhythm without it becoming bureaucratic, book a conversation.