Where your data goes when you paste it into a chatbot

TL;DR

When you paste text into a consumer AI chatbot, the text leaves your browser, travels to the vendor's servers, gets tokenised, runs through a model on a GPU somewhere in the world, and then sits in a database under that vendor's retention rules. Two questions change everything: free tier or paid tier, and which provider. The controls available, and the legal exposure for a UK SME, depend on the answer to both.

Key takeaways

- Pasted text leaves your browser the moment you hit return. From that point, it sits on the vendor's infrastructure under their retention and training rules, not yours.
- The two questions that change everything: which tier (free, consumer paid, business, enterprise), and which vendor. Defaults differ sharply and many owners get the worst of both by accident.
- As of May 2026, ChatGPT free and Plus train on your conversations by default and store them indefinitely. Claude free and Pro do not train by default. Gemini and Copilot do not train consumer prompts. None of these consumer tiers give you a Data Processing Agreement.
- Four categories of content should never enter a consumer-tier chatbot under any setting: live credentials, regulated personal data, board-confidential financial figures, and M&A information. The Samsung 2023 source-code leak is the textbook example of why.
- Verify your own setup in three places before trusting any tool with anything sensitive: the training opt-out toggle, the retention setting, and the data residency region. If you cannot find all three in the account, treat the tool as off-limits for client data.

It is Friday afternoon. An owner is sitting with a coffee, three days after a busy Tuesday, and a thought arrives that will not leave. On Tuesday she pasted a draft client proposal into free ChatGPT to tighten the language before sending it. The proposal had the client’s name, the figures, and a paragraph that referenced a sensitive piece of context the client had only shared in confidence. She does not remember whether she was logged into a paid account or a free one. She does not know whether that text is now sitting in a database, on a server somewhere, being read by anyone, or training a future model. She would rather not think about it. She is thinking about it anyway.

This is the position a meaningful share of owners are sitting in right now. The chatbots are useful, the chatbots are fast, and the question of what actually happens to the text after the send button has never quite been answered in plain English. The answer matters, because the controls available depend on it, and the controls vary widely between providers and tiers.

What actually happens when you paste text into a chatbot?

The text leaves your browser the moment you hit return. It travels over an encrypted connection to the vendor’s servers, gets tokenised into numerical fragments the model can read, runs through the model on a GPU somewhere in their infrastructure, and the reply comes back. Both prompt and response are stored in a conversation database the vendor controls. The round trip takes seconds. The data is now under their retention rules, not yours.

The first thing to absorb is that the browser is not doing the work. The model is not on your laptop. Every byte of the prompt has crossed the public internet, landed on the vendor’s GPUs, and been written to a database before you see the reply. The second thing is that storage is a deliberate feature of the product. Conversation history is what lets the chatbot remember the last thing you asked. It also means a regulator asking what data left your firm last Tuesday can be answered, but only by the vendor, and only on the vendor’s terms.
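To make that concrete, here is a minimal sketch of what a chat request actually is under the hood: an HTTPS POST whose JSON body carries your full prompt. The endpoint-style field names below follow OpenAI's public chat-completions format, but the model name and pasted text are illustrative, and no real request is sent.

```python
import json

# Illustrative only: the pasted text and model name are made up.
pasted_text = "Draft proposal for a named client, fees and confidential context included."

# This is the shape of the request body a chatbot front-end builds
# before sending it to the vendor's servers.
payload = {
    "model": "gpt-4o",  # illustrative model name
    "messages": [
        {"role": "user", "content": pasted_text},
    ],
}

body = json.dumps(payload)

# Every byte of the pasted text sits inside the body that leaves your
# machine. TLS encrypts it in transit; the vendor decrypts and stores
# it on arrival.
assert pasted_text in body
print(f"{len(body)} bytes leave your browser")
```

The point of the sketch is simply that nothing is abstracted away on the wire: the prompt travels verbatim, and whatever the vendor's retention rules say then applies to the whole of it.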

Why does this matter for your business?

UK GDPR and the Data Use and Access Act 2025 do not stop applying because the data sits inside a chatbot prompt. If you process personal data of clients, staff, or customers, you are the controller. Pasting that data into a tool that trains on your conversations is processing without a lawful basis. The ICO has been explicit since 2023 that AI tools sit inside its enforcement remit.

The exposure scales with what is in the prompt. A draft blog post is one thing. A client proposal with named figures is another. A board pack with an acquisition target is in a different category entirely. Cyberhaven’s 2026 research found that 39.7% of AI interactions in surveyed organisations involve sensitive data, and 32.3% of ChatGPT use happens through personal accounts that bypass any enterprise controls the firm has set. That is the median pattern in working firms today.

The platform is not your data processor unless you hold a Data Processing Agreement with them, and consumer tiers do not provide one. Without a DPA, every prompt containing client data is a fresh exposure to an ICO complaint or a client breach claim.

Where the exposure actually sits: the four tiers that change everything

Free and consumer-paid tiers (ChatGPT Plus, Claude Pro, Gemini Advanced, Copilot Pro) are aimed at individuals. Business tiers (ChatGPT Business, Claude Team, Gemini for Workspace, Microsoft 365 Copilot) are aimed at firms. Enterprise tiers add data residency, audit logs, and a contractual DPA. The defaults flip sharply between consumer and business tiers, and that flip is where most of the legal exposure for an SME sits.

ChatGPT free and Plus train on your conversations by default and store them indefinitely on US infrastructure. The opt-out exists, but it is buried in settings at privacy.openai.com and many users have never found it. Claude free and Pro, since Anthropic’s October 2025 policy change, do not train on consumer chats by default and retain them for thirty days. Gemini consumer tiers do not train on prompts but auto-delete activity after eighteen months. Copilot Pro does not train but offers limited residency control. None of these consumer tiers give you a DPA, so for any prompt containing personal data they are technically off-limits if you are GDPR-bound.

Business and enterprise tiers reverse the defaults. ChatGPT Business, Claude Team, Gemini for Workspace, and Microsoft 365 Copilot all default to no-training, configurable retention, and a contractual DPA. Microsoft 365 Copilot and Gemini for Workspace also offer EU Data Boundary, meaning the data stays inside the EU. These are the tiers a UK SME actually needs if it is putting any client or staff data into a chatbot. The decision is covered in more detail in the paid LLM tier data risk decision, which walks through the £150-a-month threshold that usually flips the calculation.

What never goes in: the four content categories to keep out of any chatbot

Live credentials, regulated personal data, board-confidential financial figures, and M&A information should not be pasted into any consumer-tier chatbot regardless of opt-out settings. The opt-out only controls future training. It does not control who at the vendor can read the conversation during inference, and it does not undo data already used in a completed training run. For these four categories, the right question is whether the tool needs to see the data at all.

The Samsung incident in 2023 is the canonical case. Employees pasted proprietary semiconductor design specifications and internal source code into free ChatGPT to debug them. That code became part of OpenAI’s training corpus. The data could not be recalled. Samsung temporarily banned ChatGPT internally, then reintroduced it with strict classification rules. The same exposure applies in miniature for an SME pasting a client’s regulated personal data, a partner’s compensation figures, or the name of an acquisition target into a consumer chatbot. Once the prompt is sent, the only recovery is the vendor’s deletion policy, and deletion does not reach the trained model. Categorising data before it goes near a chatbot is covered in the four-tier data classification for AI, which gives the working SME version of the policy.

Everything else (marketing copy, anonymised process descriptions, public-domain background research, the wording of a generic email) is fine in any tier with the opt-out set correctly. The category line, not the tool, is what matters.

How to verify your own setup in three places

Before you trust any chatbot with anything sensitive, find three settings in the account: the training opt-out, the retention period, and the data residency region. If you cannot find all three, the tool is not configured for business use. The training opt-out controls a future model. Retention controls how long the conversation sits in the database. Residency controls which jurisdiction’s laws apply to the data right now.

For ChatGPT, the training opt-out is at privacy.openai.com and the residency setting only exists on Business and Enterprise plans. For Claude, the training toggle sits under Settings, Privacy, “Help improve Claude”, and residency control only exists on Enterprise. For Gemini, activity controls and the EU Data Boundary toggle both live in the Workspace admin console. For Microsoft 365 Copilot, residency and audit logging are configured in the Microsoft 365 admin centre under Copilot settings.

The setting that matters most for a UK SME is residency. Data residency determines whether your data is subject to US government access under the CLOUD Act, or stays inside the UK or EU under GDPR. For any tool you intend to use with client data, that setting is non-negotiable, and it only exists on the business and enterprise tiers.

If you are looking at the chatbot landscape from the inside of a small firm and you are not sure which tier you are on, which provider you should be on, or what to do about the prompts that have already gone out, book a conversation.

Sources

- OpenAI (2026). ChatGPT Data Residency and Inference Residency Help Center. Tier-by-tier residency options and inference geo controls. https://help.openai.com/en/articles/9903489-data-residency-and-inference-residency-for-chatgpt
- OpenAI (2026). How to turn off model training. Official opt-out path for ChatGPT free and Plus. https://help.openai.com/en/articles/8983082-how-do-i-turn-off-model-training-to-stop-openai-training-models-on-my-conversations
- Anthropic (2025). Updates to our consumer terms. The training-default flip and 30-day retention rule. https://www.anthropic.com/news/updates-to-our-consumer-terms
- Anthropic (2026). Claude data residency documentation. Workspace-level geo controls and Zero Data Retention for API. https://platform.claude.com/docs/en/manage-claude/data-residency
- Google (2026). Generative AI in Google Workspace privacy hub. Workspace data handling, EU Data Boundary, and admin retention settings. https://knowledge.workspace.google.com/admin/gemini/generative-ai-in-google-workspace-privacy-hub
- Microsoft (2026). Microsoft 365 Copilot privacy documentation. Data residency, EU Data Boundary, audit logging, and no-training commitments. https://learn.microsoft.com/en-us/microsoft-365/copilot/microsoft-365-copilot-privacy
- Information Commissioner's Office (2025). Guidance on AI and data protection. The UK regulator's view on lawful basis, transparency, and DPIA triggers for AI use. https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/
- Cyberhaven (2026). Sensitive data flowing into AI tools. Research finding 39.7% of AI interactions involve sensitive data and 32.3% of ChatGPT use is via personal accounts. https://www.cyberhaven.com/blog/sensitive-data-flowing-into-ai-tools
- SamMobile (2025). Samsung lets employees use ChatGPT again after secret data leak. Coverage of the 2023 source-code leak that led to internal restrictions. https://www.sammobile.com/news/samsung-lets-employees-use-chatgpt-again-after-secret-data-leak-in-2023/
- The Hacker News (2023). OpenAI Redis bug behind ChatGPT chat history leak. Cross-user data exposure incident from 20 March 2023. https://thehackernews.com/2023/03/openai-reveals-redis-bug-behind-chatgpt.html

Frequently asked questions

If I'm using free ChatGPT for client work, what's actually happening to that data?

It is going to OpenAI's US infrastructure, being stored indefinitely, and by default it is being used to train future ChatGPT models. There is no Data Processing Agreement covering you, so under UK GDPR you are exposed if any client personal data is in those prompts. The opt-out exists at privacy.openai.com, but it does not apply retroactively to anything already pasted in.

Is Claude actually safer than ChatGPT for an SME?

For training behaviour, yes, since late 2025. Anthropic flipped its consumer tiers so chats are not used for training by default. Retention is 30 days rather than indefinite. But Claude does not offer UK or EU data residency on consumer tiers, so if you process EU personal data and need GDPR-compliant residency, Claude consumer tiers do not give you that. Microsoft 365 Copilot and Gemini for Workspace do.

I have already pasted sensitive things in. What now?

You cannot extract data that has already been used in a completed training run. The model is trained, the data is in the weights, and there is no way to remove it. What you can do: delete the conversations now so they are not used in future training runs, change the training opt-out so new conversations are not used, rotate any credentials that were ever pasted, and write down a short policy so the next person on the team does not repeat the mistake.

This post is general information and education only, not legal, regulatory, financial, or other professional advice. Regulations evolve, fee benchmarks shift, and every situation is different, so please take qualified professional advice before acting on anything you read here. See the Terms of Use for the full position.

Ready to talk it through?

Book a free 30 minute conversation. No pitch, no pressure, just a useful chat about where AI fits in your business.

Book a conversation
