Data classification in regulated financial services firms

TL;DR

Data classification is the process of sorting your data into sensitivity tiers and applying specific handling rules to each. In regulated UK financial services it is the foundation for every security and compliance decision: the FCA and Bank of England both run formal classification schemes, and the ICO ties classification directly to data minimisation and accountability obligations. An owner-managed firm needs three to four levels, clear handling rules, and a real link to access controls to make it work in practice.

Key takeaways

- Data classification is the organising principle that links regulatory compliance, security controls, and AI adoption decisions in a regulated financial services firm.
- IBM's 2023 research found that organisations with extensive data classification tools had breach costs on average USD $1.03m lower than those without, a 20.6% reduction.
- Both the FCA and Bank of England operate formal four-tier classification schemes and expect third parties handling their data to apply equivalent discipline.
- A practical starting point for an owner-managed firm is three to four levels with clear handling rules covering storage, access, transmission, and retention for each.
- The most common failure modes are paper-only schemes without linked controls, and ungoverned SaaS tools where sensitive data accumulates without any classification at all.

Picture a small financial planning firm with eight advisers, a compliance officer who wears several hats, and fifteen years of client data spread across a CRM, a shared drive, email, and three SaaS tools adopted over the past few years. Someone suggests using an AI assistant to summarise client meeting notes. The question nobody can answer confidently is which of that data is safe to feed into the tool, and which would create a regulatory problem if it ended up somewhere it should not.

That question is exactly what data classification is built to answer.

What is data classification?

Data classification is the process of sorting your data into defined sensitivity tiers and applying specific handling rules to each. In practice, it means labelling information as public, internal, confidential, or restricted, and writing clear rules about who can access each level, where it can be stored, and how it must be transmitted. The classification scheme is the organising map all other security and compliance decisions hang from.

For an owner-managed firm, three or four levels is typically enough. UK-focused guidance from the NCSC and sector-specific IT advisors consistently recommends a simple scheme: public, internal, confidential, and restricted or highly confidential. Each level carries its own handling rules covering access, storage location, encryption requirements, and retention periods.

The classification does not stop at the label itself. Attaching a “Confidential” tag to a file folder does nothing useful if there are no linked controls behind it. Classification becomes operational when the label determines what actually happens next: which tools can process the data, who can share it externally, and what response is triggered if it is lost or accessed without authorisation. A scheme that lives only in a policy document rather than in the configuration of your systems is a compliance risk waiting to surface.
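
To make the idea of a label that drives behaviour concrete, here is a minimal Python sketch of a four-level scheme with handling rules attached. The role names, retention periods, and the `may_access` helper are illustrative assumptions, not values taken from any regulator's scheme.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Sensitivity tiers, ordered so a higher value means more sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


@dataclass(frozen=True)
class HandlingRule:
    """Handling rules for one tier (all values below are assumptions)."""
    allowed_roles: frozenset[str]
    encrypt_at_rest: bool
    external_sharing: bool
    retention_years: int


# Hypothetical rule set; a real firm would agree these with its compliance lead.
RULES: dict[Level, HandlingRule] = {
    Level.PUBLIC: HandlingRule(frozenset({"anyone"}), False, True, 1),
    Level.INTERNAL: HandlingRule(
        frozenset({"staff", "adviser", "compliance"}), False, False, 3
    ),
    Level.CONFIDENTIAL: HandlingRule(
        frozenset({"adviser", "compliance"}), True, False, 6
    ),
    Level.RESTRICTED: HandlingRule(frozenset({"compliance"}), True, False, 6),
}


def may_access(role: str, level: Level) -> bool:
    """Return True if the role is permitted to access data at this tier."""
    rule = RULES[level]
    return "anyone" in rule.allowed_roles or role in rule.allowed_roles
```

The point of the sketch is that the label is looked up every time a decision is made, so changing a rule changes the outcome rather than just the wording of a policy.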

Why does it matter more in financial services?

Financial services firms hold data that carries a disproportionate cost when it goes wrong. Customer PII, payment details, KYC documents, trading records, and regulated communications all sit under overlapping regimes: UK GDPR via the ICO, FCA Principles and SYSC rules, PRA oversight, and NCSC cybersecurity expectations. IBM’s 2023 Cost of a Data Breach Report found financial services had the second-highest average breach cost globally at USD $5.90m, roughly £4.7m.

The regulatory enforcement record reinforces the point. Tesco Bank was fined £16.4m by the FCA following a cyber attack that exploited weaknesses in customer debit card controls. British Airways was fined £20m by the ICO for failing to protect login, payment card, and booking data. Marriott International was fined £18.4m by the ICO for not properly understanding or securing data in an acquired reservation system. In each case the regulator’s finding was not only that the firm had suffered a breach, but that its controls were not proportionate to the sensitivity of the data it held.

The FCA’s Dear CEO letters on operational resilience and cyber risk have cited weak information classification as a gap on multiple occasions since 2021. The ICO’s accountability framework requires organisations to demonstrate controls appropriate to the risk of their data. Classification is the mechanism that makes “proportionate controls” a defensible claim rather than an aspiration.

Where will you actually meet it in a regulated firm?

You will meet data classification requirements in three places: your regulator’s expectations, your suppliers’ contractual requirements, and the operational decisions you make every day about where to store and share information. The FCA and Bank of England both operate formal four-tier classification schemes for their own data, and third parties handling that data are expected to mirror the same discipline.

The Bank of England’s third-party standard uses four levels: Public, Official, Official-Sensitive, and Secret. Each level carries specific rules on access control, storage, encrypted transmission, and incident reporting requirements. The FCA runs a parallel scheme with four categories of its own, covering different transmission and disposal requirements at each level. These are not aspirational frameworks. Suppliers and technology partners selling into UK financial institutions are increasingly asked to demonstrate that their products handle data in line with those classification and handling standards.

For an owner-managed financial services firm, the practical encounter with classification arrives in three moments: when you adopt a new SaaS tool and need to decide what data goes into it, when a regulatory review or client due-diligence exercise asks about your data risk controls, and when an AI tool enters the picture. Each moment requires a working answer to the same question: which of our data is sensitive, where does it currently live, and who can reach it?

When is a formal scheme worth the effort, and when is it overkill?

The honest answer depends on how much data you hold and how many systems it sits across. An owner-managed firm running its entire operation through a single FCA-approved platform and keeping minimal local storage will gain more from tightening that platform’s own configuration than from building a four-level written scheme. The ICO’s accountability framework is explicit: controls should be appropriate to the risk, not maximal by default.

The calculus shifts as complexity grows. A ten-person advisory firm with client data across a CRM, file storage, email, and several cloud-based tools has genuine classification exposure. IBM’s 2023 research found that organisations with extensive use of data classification and data discovery tools had breach costs on average USD $1.03m lower than those without, a 20.6% reduction. The cost of getting it wrong is not theoretical.

Over-classification is a real failure mode too. If everything is labelled “Restricted”, staff route around the labels and security weakens in practice. Ponemon Institute’s 2023 Global Data Risk Report found that 76% of organisations have more than one million files accessible to every employee, often because access was never restricted in the first place. Three or four levels, applied consistently and enforced through system configuration and access controls, is substantially more effective than six levels on paper with no link to how the systems actually behave.

How does data classification connect to AI adoption and operational resilience?

Data classification is the common thread between three things that sit at the top of the UK regulatory agenda for financial services firms: operational resilience planning, AI governance, and access control. Once you know which of your data is sensitive, you can make defensible decisions about where AI tools can reach, which services are critical to the business, and who inside the firm genuinely needs access to what.

On AI specifically, the NCSC’s guidance on using public generative AI safely recommends that organisations classify data and prohibit feeding sensitive financial and customer data into public AI tools. The ICO’s 2023 generative AI guidance adds that prompts and outputs involving identifiable individuals should be treated as personal data under UK GDPR. Classification gives you the framework to act on both recommendations: if client account records are classified as Confidential, the handling rule can state plainly that they must not be entered into unmanaged AI tools.
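
A handling rule of this kind can be expressed as a simple gate. The sketch below is illustrative only: the tool names and the `TOOL_CEILING` register are hypothetical, and it assumes each approved tool is recorded with the highest tier it may process.

```python
from enum import IntEnum


class Level(IntEnum):
    """Sensitivity tiers, ordered so a higher value means more sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


# Hypothetical tool register: the highest tier each tool is approved to process.
TOOL_CEILING = {
    "public-chatbot": Level.PUBLIC,
    "managed-ai-assistant": Level.INTERNAL,
    "crm": Level.RESTRICTED,
}


def may_process(tool: str, data_level: Level) -> bool:
    """Allow processing only if the tool is registered and approved for the tier.

    Tools missing from the register are denied by default.
    """
    ceiling = TOOL_CEILING.get(tool)
    return ceiling is not None and data_level <= ceiling
```

Denying unregistered tools by default is a deliberate choice in this sketch: a SaaS product nobody has assessed is treated as unsuitable for any data until someone adds it to the register.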

On operational resilience, the Bank of England and FCA’s joint policy requires firms to identify their important business services and map the data assets supporting them. That mapping is classification work, whether or not it is called that. On access control, classification is the prerequisite for tightening over-permissive permissions. You cannot apply role-based access controls sensibly until you know what data is in each system and how sensitive it is.

A proportionate starting point for a firm without a scheme: define three or four levels and write the handling rules for each, take stock of your key systems and identify the highest-sensitivity data present in each one, and link the classification to at least one concrete control per level, whether that is multi-factor authentication, encryption in transit and at rest, or a written rule about which tools can process that tier of data. Reviewed annually, that is a defensible foundation that satisfies the regulator’s proportionality test and gives you a clear answer the next time someone asks about feeding client data into an AI tool.
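
The stock-take and the link to controls can be sketched in a few lines. Everything here is an assumption for illustration: the system names, the tiers found in each, and the controls listed against each level.

```python
from enum import IntEnum


class Level(IntEnum):
    """Sensitivity tiers, ordered so a higher value means more sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


# Hypothetical inventory: the data tiers found in each key system.
INVENTORY = {
    "crm": [Level.INTERNAL, Level.CONFIDENTIAL, Level.RESTRICTED],
    "shared-drive": [Level.PUBLIC, Level.INTERNAL, Level.CONFIDENTIAL],
    "website-cms": [Level.PUBLIC],
}

# Hypothetical controls linked to each tier; RESTRICTED is left empty
# on purpose so the check below has a gap to find.
CONTROLS = {
    Level.PUBLIC: ["sign-off before publication"],
    Level.INTERNAL: ["multi-factor authentication"],
    Level.CONFIDENTIAL: [
        "multi-factor authentication",
        "encryption in transit and at rest",
    ],
    Level.RESTRICTED: [],
}


def highest_level(inventory):
    """Map each system to the most sensitive tier it holds."""
    return {system: max(levels) for system, levels in inventory.items()}


def levels_without_controls(inventory, controls):
    """List tiers that are in use somewhere but have no linked control."""
    in_use = {level for levels in inventory.values() for level in levels}
    return sorted(level for level in in_use if not controls.get(level))
```

Running `levels_without_controls(INVENTORY, CONTROLS)` on this sample data flags the Restricted tier, which is the kind of gap an annual review is meant to surface.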

Sources

- IBM (2023). Cost of a Data Breach Report 2023. Financial services had the second-highest average breach cost globally at USD $5.90m; organisations with extensive classification tools had breach costs USD $1.03m lower on average. https://www.ibm.com/reports/data-breach
- ICO (2024). Guide to the UK GDPR: Principles. Sets out data minimisation and integrity/confidentiality obligations that underpin classification requirements for all UK data controllers. https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/guide-to-the-uk-gdpr/principles/
- ICO (2024). Accountability Framework. Requires organisations to demonstrate controls proportionate to the risk of data held; classification is one of the clearest accountability mechanisms available. https://ico.org.uk/for-organisations/accountability-framework/
- Bank of England (2024). Information Security Classification Scheme Standard for Third Parties. Sets a four-tier scheme (Public, Official, Official-Sensitive, Secret) with specific handling rules that third parties handling Bank data must follow. https://www.bankofengland.co.uk/-/media/boe/files/about/human-resources/iscs-external-guidance.pdf
- FCA (2024). FCA classified information. Describes the FCA's own four-level classification scheme and the storage, transmission, and disposal rules for each level. https://www.fca.org.uk/legal-information/fca-classified-information
- NCSC (2024). Cloud security guidance: principles. Recommends a formal information classification scheme to determine where data can be stored and what technical controls are required per sensitivity tier. https://www.ncsc.gov.uk/collection/cloud-security
- NCSC (2023). Using public generative AI safely. Recommends organisations classify data and prohibit feeding sensitive financial and customer data into public AI tools. https://www.ncsc.gov.uk/guidance/using-public-generative-ai-safely
- ICO (2023). Generative AI: Data protection and privacy considerations. Stresses data minimisation and restricting personal data input into generative AI models under UK GDPR. https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/artificial-intelligence/generative-ai/
- FCA (2018). Tesco Personal Finance plc, Final Notice 1 October 2018. £16.4m fine following a cyber attack exploiting weaknesses in customer account data controls; illustrates the FCA's scrutiny of data risk management proportionality. https://www.fca.org.uk/publication/final-notices/tesco-personal-finance-plc-2018.pdf
- Varonis / Ponemon Institute (2023). Global Data Risk Report 2023. Found 76% of organisations have more than one million files accessible to every employee, illustrating the scale of over-permissive access where classification and access controls are absent. https://www.varonis.com/blog/global-data-risk-report

Frequently asked questions

What data classification levels should a small financial services firm use?

Three or four levels work well for an owner-managed firm: public, internal, confidential, and restricted. Each level needs clear handling rules covering where data can be stored, who can access it, how it must be transmitted, and how long it is kept. Four levels give enough granularity to distinguish between marketing materials and client account records without creating so many categories that staff stop applying them consistently.

Does UK GDPR require data classification?

UK GDPR does not mandate a specific classification scheme, but it does require data minimisation, integrity and confidentiality, and accountability, which together push regulated firms towards classification in practice. The ICO's accountability framework asks organisations to demonstrate controls appropriate to the risk of the data they hold. A classification scheme is one of the clearest ways to show you have assessed that risk and applied proportionate controls.

Can I use AI tools with client data if I have a classification scheme in place?

A well-built classification scheme will clarify what is safe to use with AI tools and what is off-limits. As a working rule, anything classified as confidential or above, including client account details, KYC documents, and payment records, should not be entered into unmanaged public AI tools. The NCSC and ICO both recommend using data classification to set explicit boundaries on AI tool access, particularly for generative AI systems.

This post is general information and education only, not legal, regulatory, financial, or other professional advice. Regulations evolve, fee benchmarks shift, and every situation is different, so please take qualified professional advice before acting on anything you read here. See the Terms of Use for the full position.
