The question comes up often with owners preparing to deploy AI tools. An IT consultant or data protection adviser asks whether they have considered data minimisation before the new rollout. The owner says yes. Then they put the phone down and type the phrase into a search engine.
What comes back is Article 5(1)(c), a wall of legal text, and several explainers written for compliance teams. None of it answers the practical question: what does this actually change in a 15-person consultancy on a Tuesday?
That is what this post covers.
What is data minimisation?
Under UK GDPR, data minimisation means you only collect and keep personal data that is genuinely necessary for a specific, defined purpose. The ICO uses three tests: is each data item adequate (enough to do the job), relevant (rationally linked to the purpose), and limited (could you do the job with less)? If a data item fails any of the three tests, you should not hold it.
In practice, it rules out the collect-it-all-just-in-case instinct. If your client intake form asks for a date of birth and your service has nothing to do with age, that field fails the relevant test. If your CRM holds ten years of contact history for people who never became clients, much of that fails the limited test. The ICO’s guidance is direct: identify the minimum you actually need, hold that, and delete the rest.
Why does it matter for your business?
The legal exposure is real. UK GDPR carries fines of up to £17.5 million or 4% of global annual turnover, whichever is higher, for serious infringements of core data protection principles. The ICO has used those powers: British Airways was fined £20 million in 2020 after a cyberattack exposed the data of around 400,000 customers, and Marriott International was fined £18.4 million that year for a breach reaching 339 million guest records.
Neither fine was primarily for excessive data collection, but the volume of data exposed in both cases amplified the harm and the regulatory response. The NCSC’s position is plain: attackers cannot steal what you do not hold. IBM’s 2023 Cost of a Data Breach report found the global average breach cost was $4.45 million, a 15% rise over three years. Organisations with strong data governance consistently incurred lower costs when incidents occurred.
The Clearview AI enforcement action in 2022 illustrates the minimisation failure mode in its starkest form. The ICO fined the company £7.5 million and ordered it to delete all UK residents’ data. Clearview had scraped billions of facial images from the internet with no lawful basis and no defined purpose for holding them. The individuals affected had no relationship with the company at all. That is the collect-it-all instinct taken to its logical end.
The AI layer adds a pressure the ICO has addressed directly. Its guidance on AI and data protection makes clear that minimisation is not optional for AI projects. If you cannot justify why a particular data item is necessary for a specific AI task, you should not be processing it. The European Data Protection Board’s Article 25 guidelines extend this to system design: minimisation is built in from the start, through default settings and architecture, rather than added afterwards.
Where will you actually meet it?
For a services firm with 5 to 50 people, data minimisation applies to more systems than owners typically expect. Your CRM, HR files, invoices, call recordings, email marketing lists, and any AI tools that process personal data are all in scope. The rule applies whenever you handle information about identified or identifiable individuals, whether they are clients, leads, staff, or prospects.
The most common place owner-managed service firms over-collect is the lead form. A typical contact form asks for name, email, phone, company, and role. Many add postcode, company turnover, headcount, and industry. Unless those extra fields directly determine what service you offer, they fail the limited test and should not be there.
AI tool inputs are the newer pressure point. Many AI-assisted workflows, from email drafting to CRM enrichment to meeting summaries, involve feeding in personal data. For each workflow, the question is whether the AI genuinely needs that specific field to produce the output you want, or whether it is present because it happened to be in the file you exported. Many SaaS AI vendors now offer enterprise modes that prevent your data being used for model training. Turning those on where available reduces your exposure, although it does not remove the need to minimise what you send in the first place.
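One way to make that question routine is to strip every record down to an agreed list of fields before it reaches an AI tool. The sketch below is illustrative only: the workflow names, field names, and the ALLOWED_FIELDS table are assumptions made for this example, not part of any vendor's product or the ICO's guidance.

```python
from typing import Any

# Hypothetical allowlists: the fields each AI workflow is assumed to need.
# A real firm would agree these per workflow and record the reasoning.
ALLOWED_FIELDS: dict[str, set[str]] = {
    "meeting_summary": {"first_name", "company", "meeting_notes"},
    "email_draft": {"first_name", "company", "last_enquiry"},
}


def minimise_for_workflow(record: dict[str, Any], workflow: str) -> dict[str, Any]:
    """Return a copy of the record holding only the fields allowed for this workflow."""
    # An unknown workflow raises KeyError on purpose: no allowlist, no data sent.
    allowed = ALLOWED_FIELDS[workflow]
    return {key: value for key, value in record.items() if key in allowed}


# A made-up row, as it might come out of a CRM export.
crm_export_row = {
    "first_name": "Asha",
    "surname": "Patel",
    "company": "Northgate Ltd",
    "date_of_birth": "1984-02-11",
    "mobile": "07700 900123",
    "last_enquiry": "Pricing for a quarterly review",
}

payload = minimise_for_workflow(crm_export_row, "email_draft")
```

Here payload keeps only the first name, company, and last enquiry. The date of birth and mobile number never leave the CRM, which is the point: the default becomes "not sent" unless someone has decided a field is needed.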
Retention defaults are the third place things go wrong. CRM and helpdesk systems typically hold everything indefinitely unless you configure them otherwise. Setting deletion or review dates per data category, and making sure your systems honour them, is the practical minimum. Monzo’s publicly documented approach, deleting or anonymising data once its regulatory retention period expires, is a proportionate model that scales down cleanly for a small service firm.
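As a rough illustration of review dates per data category, the sketch below flags records that have outlived their category's retention period. The categories and the periods in RETENTION_DAYS are placeholders chosen for the example, not recommended values, and most CRM or helpdesk tools would handle this through their own settings rather than a script.

```python
from datetime import date, timedelta

# Hypothetical retention periods per data category, in days.
# These numbers are placeholders, not legal or regulatory advice.
RETENTION_DAYS = {
    "unconverted_lead": 365,
    "support_ticket": 2 * 365,
    "invoice": 6 * 365,
}


def is_due_for_review(category: str, last_activity: date, today: date) -> bool:
    """True when a record is older than the retention period for its category."""
    return today - last_activity > timedelta(days=RETENTION_DAYS[category])


# Made-up records standing in for rows from a CRM or helpdesk export.
records = [
    {"id": 1, "category": "unconverted_lead", "last_activity": date(2022, 3, 1)},
    {"id": 2, "category": "invoice", "last_activity": date(2022, 3, 1)},
    {"id": 3, "category": "support_ticket", "last_activity": date(2024, 11, 15)},
]

today = date(2025, 1, 1)
due_ids = [
    r["id"]
    for r in records
    if is_due_for_review(r["category"], r["last_activity"], today)
]
```

With these example dates only record 1, the old unconverted lead, is flagged. The invoice from the same day is kept because its category carries a longer period, which is how per-category rules differ from a single blanket deletion date.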
When does minimisation apply, and when can you ease off?
Minimisation does not mean deleting everything you are not actively using this week. The rule has sensible limits. HMRC expects limited companies to keep accounting records for six years from the end of the financial year they relate to, and the ICO accepts that you may need to retain some data through legal limitation periods to defend potential claims. Completely anonymous data, where no individual can be re-identified, falls outside UK GDPR entirely.
Pseudonymised data is different. Where a code or identifier could be re-linked to an individual, it remains personal data and remains subject to minimisation. This matters for AI systems that reference customer IDs or session tokens in their inputs or logs.
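A short sketch shows why. It replaces an email address with a keyed hash, which is one common way to pseudonymise an identifier; the function name, the key, and the email addresses are all invented for the example. Anyone holding the key and the customer list can rebuild the link, so the token is still personal data.

```python
import hashlib
import hmac


def pseudonymise(customer_email: str, secret_key: bytes) -> str:
    """Replace an email address with a keyed hash (a pseudonym, not anonymisation)."""
    normalised = customer_email.strip().lower().encode("utf-8")
    return hmac.new(secret_key, normalised, hashlib.sha256).hexdigest()[:16]


# Placeholder key for illustration; a real key would be stored securely.
key = b"example-key-held-by-the-firm"

# The token that might appear in an AI tool's input or logs.
token = pseudonymise("asha@example.com", key)

# Re-linking: with the key and the customer list, the mapping can be rebuilt.
customer_emails = ["asha@example.com", "ben@example.com"]
lookup = {pseudonymise(email, key): email for email in customer_emails}
relinked = lookup[token]
```

The token reveals nothing on its face, yet relinked recovers the original address. That is the sense in which a customer ID or session token in a log remains subject to minimisation.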
There is also a legitimate edge case around AI model quality. The EDPB notes that “adequate” means holding enough data to do the job properly. Stripping demographic fields from equality monitoring data, for example, can introduce bias rather than reduce risk. The principle is proportionality: collect what is genuinely necessary, no more, but recognise that “necessary” sometimes means more than the bare minimum.
For a small service firm, though, the default should still be to collect less. The failure mode that creates regulatory and cyber exposure is over-collection and over-retention, not the reverse.
What else connects to this principle?
Data minimisation sits alongside three other UK GDPR requirements that reinforce it. Storage limitation says you cannot keep personal data in identifiable form longer than necessary for your purposes. Purpose limitation says you may only use data for the purpose it was originally collected for, or for one compatible with it. Article 25, privacy by design, requires you to build minimisation into your systems from the start rather than treating it as a retrofit.
The EU AI Act, being phased in from 2026, adds another layer for firms with operations or clients in EU markets. High-risk AI systems must implement data governance measures including relevance and minimisation in training data. If you are building or deploying AI tools for EU clients or staff, this is worth tracking now, not at implementation.
For a 5 to 50 person service firm, the practical starting point is a short data audit: for each system that holds personal data, confirm the purpose of each field, identify what could be deleted or anonymised, and set a review date. The ICO’s guidance, the NCSC’s security advice, and the FCA’s comparable requirements for financial services firms all point to the same conclusion: smaller data sets are easier to protect, cheaper to manage, and significantly safer if something goes wrong. The compliance case and the cyber security case are, in this instance, the same argument.
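The short audit described above fits in a spreadsheet, but for firms that prefer something scriptable, here is a minimal sketch. The FieldAuditEntry structure, the system names, and the example fields are assumptions made for illustration; the only logic is to list every field that lacks a documented purpose or a review date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class FieldAuditEntry:
    """One row of a hypothetical data audit: a single field in a single system."""
    system: str
    field_name: str
    purpose: Optional[str]       # None or "" means nobody could state a purpose
    review_date: Optional[date]  # None means no review or deletion date is set


def audit_gaps(entries: list[FieldAuditEntry]) -> list[str]:
    """List the fields that lack a documented purpose or a review date."""
    gaps = []
    for entry in entries:
        missing = []
        if not entry.purpose:
            missing.append("purpose")
        if entry.review_date is None:
            missing.append("review date")
        if missing:
            gaps.append(
                f"{entry.system}.{entry.field_name}: missing {' and '.join(missing)}"
            )
    return gaps


# Made-up audit rows for illustration.
audit = [
    FieldAuditEntry("CRM", "email", "Replying to enquiries", date(2026, 1, 31)),
    FieldAuditEntry("CRM", "date_of_birth", None, None),
    FieldAuditEntry("Helpdesk", "call_recording", "Resolving disputes", None),
]

gaps = audit_gaps(audit)
```

In this example the CRM date of birth is flagged twice over and the call recordings are flagged for having no review date. A field with no purpose is a candidate for deletion; a field with a purpose but no date is a candidate for a retention setting.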



