What data minimisation means under GDPR, in plain English

TL;DR

Under UK GDPR, data minimisation means you collect and retain only the personal data that is genuinely necessary for a specific, defined purpose. Nothing extra, nothing held just in case. The ICO applies three tests: adequate, relevant, and limited. For any owner-managed services firm deploying AI tools, this applies directly to what you feed into those tools, what they store, and how long you keep the outputs.

Key takeaways

- Data minimisation under UK GDPR means collecting only what is adequate, relevant, and limited to what you need for a specific purpose: nothing extra, nothing held just in case.
- The ICO expects you to map what you collect, confirm why each field is necessary, and delete or anonymise data you no longer need.
- British Airways (£20 million fine) and Marriott (£18.4 million fine) show the scale of regulatory exposure when large volumes of personal data are poorly governed.
- AI tools do not exempt you from minimisation rules: the ICO's AI guidance says the same three tests apply to what you feed into models and what those models store.
- Regulatory retention duties, such as HMRC's six-year expectation for business records, and legal limitation periods are legitimate reasons to hold data beyond immediate use, but they are specific and time-bounded, not a licence to keep everything.

The question comes up often with owners preparing to deploy AI tools. An IT consultant or data protection adviser asks whether they have considered data minimisation before the new rollout. The owner says yes. Then they put the phone down and type the phrase into a search engine.

What comes back is Article 5(1)(c), a wall of legal text, and several explainers written for compliance teams. None of it answers the practical question: what does this actually change in a 15-person consultancy on a Tuesday?

That is what this post covers.

What is data minimisation?

Under UK GDPR, data minimisation means you only collect and keep personal data that is genuinely necessary for a specific, defined purpose. The ICO uses three tests: is each data item adequate (enough to do the job), relevant (rationally linked to the purpose), and limited (could you do the job with less)? If an item fails the relevant or limited test, you should not hold it. If what you hold fails the adequate test, it is not fit for the purpose in the first place.

In practice, it rules out the collect-it-all-just-in-case instinct. If your client intake form asks for a date of birth and your service has nothing to do with age, that field fails the relevant test. If your CRM holds ten years of contact history for people who never became clients, much of that fails the limited test. The ICO’s guidance is direct: identify the minimum you actually need, hold that, and delete the rest.

Why does it matter for your business?

The legal exposure is real. UK GDPR carries fines of up to £17.5 million or 4% of global annual turnover, whichever is higher, for serious infringements of core data protection principles. The ICO has used those powers: British Airways was fined £20 million in 2020 after a cyberattack exposed the data of around 400,000 customers, and Marriott International was fined £18.4 million in the same year for a breach reaching 339 million guest records.
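
The cap is the higher of the two figures, so it scales with the size of the business. A minimal arithmetic sketch, assuming the £17.5 million and 4% figures quoted above; the function name and the turnover values are hypothetical, and actual penalties are set case by case and are typically well below the ceiling.

```python
# Illustrative only: the statutory ceiling, not a prediction of any actual fine.
FIXED_CAP_GBP = 17_500_000   # fixed maximum quoted above
TURNOVER_PERCENT = 4         # percentage of global annual turnover

def max_fine_gbp(global_annual_turnover_gbp: int) -> int:
    """Return the higher of the fixed cap and 4% of turnover (whole pounds)."""
    turnover_based = global_annual_turnover_gbp * TURNOVER_PERCENT // 100
    return max(FIXED_CAP_GBP, turnover_based)

# Hypothetical turnovers: a small consultancy and a large group.
print(max_fine_gbp(2_000_000))      # 17500000 (fixed cap is higher)
print(max_fine_gbp(1_000_000_000))  # 40000000 (4% of turnover is higher)
```

For a small firm the fixed figure is always the relevant ceiling, which is why it dwarfs annual revenue.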

Neither fine was primarily for excessive data collection, but the volume of data exposed in both cases amplified the harm and the regulatory response. The NCSC’s position is plain: attackers cannot steal what you do not hold. IBM’s 2023 Cost of a Data Breach report found the global average breach cost was $4.45 million, a 15% rise over three years. Organisations with strong data governance consistently incurred lower costs when incidents occurred.

The Clearview AI enforcement action in 2022 illustrates the minimisation failure mode in its starkest form. The ICO fined the company £7.5 million and ordered it to delete all UK residents’ data. Clearview had scraped billions of facial images from the internet with no lawful basis and no defined purpose for holding them. The individuals affected had no relationship with the company at all. That is the collect-it-all instinct taken to its logical end.

The AI layer adds a pressure the ICO has addressed directly. Its guidance on AI and data protection makes clear that minimisation is not optional for AI projects. If you cannot justify why a particular data item is necessary for a specific AI task, you should not be processing it. The European Data Protection Board’s Article 25 guidelines extend this to system design: minimisation is built in from the start, through default settings and architecture, rather than added afterwards.

Where will you actually meet it?

For a services firm with 5 to 50 people, data minimisation applies to more systems than owners typically expect. Your CRM, HR files, invoices, call recordings, email marketing lists, and any AI tools that process personal data are all in scope. The rule applies whenever you handle information about identified or identifiable individuals, whether they are clients, leads, staff, or prospects.

The most common place owner-managed service firms over-collect is the lead form. A typical contact form asks for name, email, phone, company, and role. Many add postcode, company turnover, headcount, and industry. Unless those extra fields directly determine what service you offer, they fail the limited test and should not be there.

AI tool inputs are the newer pressure point. Many AI-assisted workflows, from email drafting to CRM enrichment to meeting summaries, involve feeding in personal data. For each workflow, the question is whether the AI genuinely needs that specific field to produce the output you want, or whether it is present because it happened to be in the file you exported. Many SaaS AI vendors now offer enterprise modes that prevent your data being used for model training. Turning those on where available reduces your exposure, although you still need to minimise what you send in the first place.
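
One way to stop exported fields reaching an AI tool by accident is an allow-list per workflow. The following is a minimal sketch, not a vendor feature: the task names, field names, and `minimise_for_task` helper are all hypothetical, and the sample record is invented.

```python
# Hypothetical allow-list: the fields each AI workflow genuinely needs.
ALLOWED_FIELDS = {
    "email_draft": {"first_name", "company", "enquiry_summary"},
    "meeting_summary": {"attendee_names", "transcript"},
}

def minimise_for_task(record: dict, task: str) -> dict:
    """Keep only the fields approved for this task and drop everything else."""
    allowed = ALLOWED_FIELDS.get(task)
    if allowed is None:
        # Fail closed: a workflow nobody has reviewed gets no personal data.
        raise ValueError(f"No approved field list for task: {task!r}")
    return {key: value for key, value in record.items() if key in allowed}

# Invented CRM export row with more fields than the task needs.
crm_export_row = {
    "first_name": "Sam",
    "company": "Example Ltd",
    "enquiry_summary": "Asked about a quarterly retainer.",
    "date_of_birth": "1985-03-14",  # not needed to draft an email
    "mobile": "07700 900123",       # not needed to draft an email
}

payload = minimise_for_task(crm_export_row, "email_draft")
print(sorted(payload))  # ['company', 'enquiry_summary', 'first_name']
```

The design choice is that an unlisted workflow raises an error instead of passing the whole record through, so adding a new AI use forces someone to decide which fields it needs.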

Retention defaults are the third place things go wrong. CRM and helpdesk systems typically hold everything indefinitely unless you configure them otherwise. Setting deletion or review dates per data category, and making sure your systems honour them, is the practical minimum. Monzo’s publicly documented approach, deleting or anonymising data once its regulatory retention period expires, is a proportionate model that scales down cleanly for a small service firm.
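
A retention schedule can be as simple as a table of periods per data category plus a check that flags records past their date. This is a sketch under stated assumptions: the category names and the periods for leads and support tickets are hypothetical, the six-year accounting figure comes from the HMRC expectation discussed in this post, and years are approximated as 365 days.

```python
from datetime import date, timedelta

# Hypothetical schedule, in days. Only the six-year accounting figure comes
# from the post; the other periods are placeholders to replace with your own.
RETENTION_DAYS = {
    "unconverted_lead": 365,
    "support_ticket": 2 * 365,
    "accounting_record": 6 * 365,  # approximates six years, ignoring leap days
}

def is_due_for_review(category: str, last_needed: date, today: date) -> bool:
    """True when a record has passed its category's retention period."""
    return today > last_needed + timedelta(days=RETENTION_DAYS[category])

# Invented records for illustration.
records = [
    {"id": 1, "category": "unconverted_lead", "last_needed": date(2022, 1, 10)},
    {"id": 2, "category": "accounting_record", "last_needed": date(2022, 1, 10)},
    {"id": 3, "category": "support_ticket", "last_needed": date(2024, 6, 1)},
]

today = date(2025, 1, 1)  # fixed date so the example is repeatable
due = [
    r["id"]
    for r in records
    if is_due_for_review(r["category"], r["last_needed"], today)
]
print(due)  # [1]
```

Flagging for review, not deleting automatically, leaves room for a person to check whether a legal hold or open dispute justifies keeping a record longer.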

When does minimisation apply, and when can you ease off?

Minimisation does not mean deleting everything you are not actively using this week. The rule has sensible limits. HMRC expects businesses to keep accounting records for six years, and the ICO accepts that you may need to retain some data through legal limitation periods to defend potential claims. Completely anonymous data, where no individual can be re-identified, falls outside UK GDPR entirely.

Pseudonymised data is different. Where a code or identifier could be re-linked to an individual, it remains personal data and remains subject to minimisation. This matters for AI systems that reference customer IDs or session tokens in their inputs or logs.
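
A short sketch shows why a pseudonym is still personal data. It assumes a keyed hash (HMAC-SHA256) as the pseudonymisation method, which is one common choice and not something the ICO prescribes; the key, the customer IDs, and the `pseudonymise` helper are hypothetical.

```python
import hashlib
import hmac

def pseudonymise(customer_id: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256, hex encoded)."""
    return hmac.new(secret_key, customer_id.encode("utf-8"), hashlib.sha256).hexdigest()

key = b"hypothetical-key-kept-separately"
token_a = pseudonymise("CUST-00123", key)
token_b = pseudonymise("CUST-00123", key)

# The same input and key always give the same token, so every log line
# carrying that token can still be tied to one individual.
print(token_a == token_b)  # True

# Anyone holding the key and the customer list can rebuild the mapping.
lookup = {pseudonymise(cid, key): cid for cid in ["CUST-00123", "CUST-00456"]}
print(lookup[token_a])  # CUST-00123
```

Because the mapping can be rebuilt, tokens like these in AI inputs or logs need the same retention and access controls as the original identifiers.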

There is also a legitimate edge case around AI model quality. The EDPB notes that “adequate” means holding enough data to do the job properly. Stripping demographic fields from equality monitoring data, for example, can introduce bias rather than reduce risk. The principle is proportionality: collect what is genuinely necessary, no more, but recognise that “necessary” sometimes means more than the bare minimum.

For a small service firm, though, the default should still be to collect less. The failure mode that creates regulatory and cyber exposure is over-collection and over-retention, not the reverse.

What else connects to this principle?

Data minimisation sits alongside three other UK GDPR requirements that reinforce it. Storage limitation says you cannot keep personal data in identifiable form longer than necessary for your purposes. Purpose limitation says you may only use data for the purpose it was originally collected for, or one compatible with it. Article 25, data protection by design and by default, requires you to build minimisation into your systems from the start rather than treating it as a retrofit.

The EU AI Act, which entered into force in August 2024 and is being phased in over the following years, adds another layer for firms with operations or clients in EU markets. High-risk AI systems must implement data governance measures including relevance and minimisation in training data. If you are building or deploying AI tools for EU clients or staff, this is worth tracking now, not at implementation.

For a 5 to 50 person service firm, the practical starting point is a short data audit: for each system that holds personal data, confirm the purpose of each field, identify what could be deleted or anonymised, and set a review date. The ICO’s guidance, the NCSC’s security advice, and the FCA’s comparable requirements for financial services firms all point to the same conclusion: smaller data sets are easier to protect, cheaper to manage, and significantly safer if something goes wrong. The compliance case and the cyber security case are, in this instance, the same argument.
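
The audit described above can start life as a small register. A minimal sketch, assuming one entry per field per system; the `FieldEntry` structure, system names, and sample entries are hypothetical, and a spreadsheet with the same columns would serve equally well.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

@dataclass
class FieldEntry:
    system: str
    field_name: str
    purpose: str                 # empty if nobody can state one
    review_date: Optional[date]  # None if no review date has been set

def audit_findings(register: List[FieldEntry]) -> List[str]:
    """List fields with no stated purpose or no review date."""
    findings = []
    for entry in register:
        label = f"{entry.system}.{entry.field_name}"
        if not entry.purpose.strip():
            findings.append(f"{label}: no stated purpose, candidate for deletion")
        elif entry.review_date is None:
            findings.append(f"{label}: purpose recorded but no review date")
    return findings

# Invented register entries for illustration.
register = [
    FieldEntry("crm", "email", "reply to enquiries", date(2026, 1, 31)),
    FieldEntry("crm", "date_of_birth", "", None),
    FieldEntry("helpdesk", "call_recording", "resolve disputes", None),
]

for finding in audit_findings(register):
    print(finding)
# crm.date_of_birth: no stated purpose, candidate for deletion
# helpdesk.call_recording: purpose recorded but no review date
```

The output is a to-do list, not a verdict: each flagged field still needs a person to decide whether to delete it, anonymise it, or document why it stays.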

Sources

- ICO (2023). Principle (c): Data minimisation, UK GDPR guidance. The ICO's authoritative explanation of adequate, relevant and limited, with worked examples. https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/data-protection-principles/a-guide-to-the-data-protection-principles/data-minimisation/
- GDPR.eu. Article 5 GDPR, principles relating to processing of personal data. The legislative text establishing minimisation as a core principle alongside accuracy, storage limitation and security. https://gdpr-info.eu/art-5-gdpr/
- ICO (2020). ICO fines British Airways £20m for data breach. Enforcement action following the 2018 breach exposing personal and financial data of approximately 400,000 customers. https://ico.org.uk/about-the-ico/media-centre/news-and-blogs/2020/10/ico-fines-british-airways-20m-for-data-breach/
- ICO (2020). ICO fines Marriott International £18.4 million for failing to keep customers' personal data secure. Enforcement action following a breach affecting approximately 339 million guest records. https://ico.org.uk/about-the-ico/media-centre/news-and-blogs/2020/10/ico-fines-marriott-international-inc-18-4million-for-failing-to-keep-customers-personal-data-secure/
- ICO (2022). ICO fines Clearview AI Inc £7.5m and orders deletion of UK residents' data. Case illustrates minimisation failure: collection of billions of facial images with no specific lawful purpose. https://ico.org.uk/about-the-ico/media-centre/news-and-blogs/2022/05/ico-fines-clearview-ai-inc-7-5m/
- ICO (2023). Guidance on AI and data protection. Sets out the ICO's position that data minimisation applies to AI inputs, stored data, and outputs, and is not optional for AI projects. https://ico.org.uk/for-organisations/guide-to-data-protection/key-dp-themes/artificial-intelligence/
- European Data Protection Board (2020). Guidelines 4/2019 on Article 25, Data Protection by Design and by Default. Requires controllers to build minimisation into system architecture and default settings, not retrofit it. https://edpb.europa.eu/our-work-tools/our-documents/guidelines/guidelines-42019-article-25-data-protection-design-and_en
- NCSC. Cloud security guidance. NCSC advice that reducing data collection and retention directly reduces the impact of compromise. https://www.ncsc.gov.uk/collection/cloud-security
- IBM (2023). Cost of a Data Breach Report 2023. Global average breach cost of $4.45 million, a 15% rise over three years; organisations with strong data governance consistently incurred lower costs. https://www.ibm.com/reports/data-breach
- HMRC. Record keeping for limited companies and businesses. Sets the six-year retention expectation for business and accounting records relevant to the minimisation limit-case. https://www.gov.uk/running-a-limited-company/company-and-accounting-records

Frequently asked questions

Does data minimisation mean I have to delete everything I am not currently using?

No. Minimisation is about not collecting more than you need in the first place, and not keeping it longer than a defined purpose requires. Regulatory obligations such as HMRC's six-year record-keeping expectation, and the need to retain data through legal limitation periods, are legitimate reasons to hold data. The key is that retention should be purposeful and time-bounded, not indefinite by default.

Does data minimisation apply to the information I feed into AI tools?

Yes. The ICO's guidance on AI and data protection is explicit: minimisation applies to what you input into AI systems, what those systems store, and what you keep afterwards. If you cannot justify why a particular data field is necessary for the specific AI task, you should not be processing it. Many SaaS AI vendors now offer no-training enterprise modes that prevent your data being used to train models, which reduces your exposure further.

We are a very small firm. Does GDPR data minimisation really apply to us?

UK GDPR applies to any organisation that processes personal data about identified or identifiable individuals, with very limited exceptions. The personal household exemption covers private address books and similar, but not commercial activity of any size. A five-person services firm holding client contact records, staff HR files, or marketing lists is subject to the same minimisation principle as a large corporation, though the proportionate steps to comply are considerably simpler.

This post is general information and education only, not legal, regulatory, financial, or other professional advice. Regulations evolve, fee benchmarks shift, and every situation is different, so please take qualified professional advice before acting on anything you read here. See the Terms of Use for the full position.

Ready to talk it through?

Book a free 30 minute conversation. No pitch, no pressure, just a useful chat about where AI fits in your business.

Book a conversation
