How to prepare your business data for AI

A colleague mentions at a networking event that their team has started using Copilot to draft client reports. They are saving two hours a week. Then their IT support provider asks whether the SharePoint permissions have been reviewed lately. Silence. The data the AI was drawing on was accessible to the whole firm, including a contractor who had left eight months earlier.

The scenario is more common than you might expect. Many owner-managed services firms start using AI tools in a state of inherited disorder. Records are inconsistently entered, ownership is unclear, access controls were set up once and never revisited. The tools work well on clean, well-governed data. On messy, over-accessible data they produce unreliable outputs and create new risks. Getting your data ready is the foundation that determines whether your AI investment pays off.

What does “data ready for AI” actually mean?

“AI-ready data” is business data that has been inventoried, cleaned, secured, and documented well enough to use with AI tools without exposing your firm to legal, security, or quality risk. The UK government’s AI-ready datasets guidance breaks this into four pillars: technical quality, documentation, organisational infrastructure, and legal and ethical compliance. For a small services firm, those pillars map to a four-step sequence of practical work.

The UK government’s framework points to ISO/IEC 8183, the international standard for AI data lifecycles, as a reference. The core idea there is that data readiness is a continuous discipline rather than a project you complete once. A clean-up before your first AI pilot will help initially. Without clear ownership and regular maintenance, data quality degrades and the AI tools depending on it become progressively less reliable.

Why does this matter for a small services business?

The ICO’s guidance on AI and data protection is clear. Any organisation using AI to process personal data needs a lawful basis, must consider a Data Protection Impact Assessment for high-risk uses, and must be able to explain how the AI uses people’s data. That applies to a ten-person consultancy as much as to a bank. Failure to prepare creates real legal exposure.

The numbers give you a sense of the stakes. The ICO fined British Airways £20m in 2020 following a data breach caused by security failings. In 2019, Bounty (UK) Limited was fined £400,000 for sharing the personal data of 14 million people with third parties without adequate transparency, establishing that secondary uses of customer data, including for analytics, require clear notice and a lawful basis.

A joint Bank of England and FCA survey of UK financial firms, published in 2022, found that 72% of respondents reported using or developing machine-learning applications, but many cited data quality and availability as key barriers to deploying them safely. That gap between intention and safe deployment is what data preparation is designed to close.

The NCSC adds a practical warning. Data submitted to online AI tools may be stored or used to improve those services. The NCSC recommends reviewing provider policies before sharing any sensitive business data with an AI-as-a-service tool.

Where do the problems show up in practice?

Data problems surface in predictable places once you start using AI tools. The commonest is inconsistent, duplicated, or missing records, such as client names entered differently across systems, dates in mixed formats, or older files with no clear owner. A second pattern is access-control drift, where data is readable by anyone in the firm even when only two people need it. Both problems limit what AI can do safely.

The first pass is an inventory. Build a simple spreadsheet listing each system, its data owner, the data types it holds, whether any of it is personal or special-category data (such as health or ethnicity data), and the business use. Financial details are not special-category data under UK GDPR, but they are sensitive enough to be worth flagging as well. Government guidance recommends understanding where data comes from and how it flows before using it for AI. For a services firm, that exercise typically takes a few hours rather than a few weeks.
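As a rough illustration, the inventory can be kept as structured rows and exported to a spreadsheet. The sketch below is one possible layout, not a prescribed template: the `SystemRecord` fields, the example systems, and the owner names are all hypothetical.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class SystemRecord:
    system: str
    owner: str               # named person responsible; empty if nobody owns it
    data_types: str
    personal_data: bool
    special_category: bool   # e.g. health or ethnicity data
    business_use: str


# Hypothetical entries for illustration only.
inventory = [
    SystemRecord("CRM", "Priya", "client contacts, notes", True, False, "account management"),
    SystemRecord("Shared drive", "", "proposals, old client files", True, False, "project delivery"),
    SystemRecord("HR folder", "Sam", "absence and health records", True, True, "people management"),
]


def needs_attention(record):
    """Flag rows that hold personal data but have no named owner."""
    return record.personal_data and not record.owner.strip()


def to_csv(records):
    """Serialise the inventory so it can be opened as a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(SystemRecord)])
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buffer.getvalue()


unowned = [r.system for r in inventory if needs_attention(r)]
print(unowned)  # ['Shared drive']
print(to_csv(inventory))
```

Rows flagged by `needs_attention` are a sensible place to start, since data with no owner is the data least likely to have been reviewed.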

The second pass is cleaning and securing. Standardise formats, remove obvious duplicates, and apply role-based access control. The NCSC recommends strong authentication and role-based access for any data used by AI services. Government guidance describes encrypting data at rest and in transit as “non-negotiable” for AI-ready data. Both are standard security practices; AI makes them more pressing because AI tools synthesise information across everything they can reach.
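A minimal sketch of the standardise-and-deduplicate step, using only the Python standard library. The date formats, the company-suffix rule, and the sample rows are assumptions made for illustration; a real export will need its own rules.

```python
from datetime import datetime

# Assumed input formats, read day-first as is usual in the UK.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d %B %Y")


def normalise_name(name):
    """Collapse whitespace, ignore case and full stops, unify 'Limited' and 'Ltd'."""
    cleaned = " ".join(name.replace(".", "").split()).casefold()
    if cleaned.endswith(" limited"):
        cleaned = cleaned[: -len(" limited")] + " ltd"
    return cleaned


def normalise_date(text):
    """Return an ISO 8601 date string, or None if no known format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def deduplicate(rows):
    """Keep the first row seen for each normalised (client, onboarded) pair."""
    seen = set()
    unique = []
    for row in rows:
        key = (normalise_name(row["client"]), normalise_date(row["onboarded"]))
        if key not in seen:
            seen.add(key)
            unique.append({"client": key[0], "onboarded": key[1]})
    return unique


# Hypothetical rows: the first two describe the same client.
rows = [
    {"client": "Acme Ltd.", "onboarded": "03/02/2021"},
    {"client": "  ACME  Limited", "onboarded": "2021-02-03"},
    {"client": "Birch & Co", "onboarded": "5 March 2022"},
]
clean_rows = deduplicate(rows)
print(clean_rows)
# [{'client': 'acme ltd', 'onboarded': '2021-02-03'},
#  {'client': 'birch & co', 'onboarded': '2022-03-05'}]
```

Dates that match none of the assumed formats come back as `None`, which makes them easy to pick out for a manual check rather than guessing.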

The third pass is documentation and governance. Write a short internal note per use-case: data sources, the legal basis for using personal data, retention period, and who can access it. If a use-case involves profiling clients or automating decisions about individuals, it is likely to meet the ICO’s high-risk threshold, so complete a DPIA before you proceed.
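The per-use-case note can be as simple as a structured record with a check for blanks. The sketch below assumes a hypothetical `UseCaseNote` layout; the field names and example values are illustrative rather than taken from any regulator's template.

```python
from dataclasses import dataclass, field


@dataclass
class UseCaseNote:
    name: str
    data_sources: list = field(default_factory=list)
    uses_personal_data: bool = False
    lawful_basis: str = ""      # only needed when personal data is involved
    retention: str = ""
    access_roles: list = field(default_factory=list)

    def gaps(self):
        """List the parts of the note that are still blank."""
        missing = []
        if not self.data_sources:
            missing.append("data sources")
        if self.uses_personal_data and not self.lawful_basis:
            missing.append("lawful basis")
        if not self.retention:
            missing.append("retention period")
        if not self.access_roles:
            missing.append("who can access it")
        return missing


# Hypothetical use-case with two fields left unfilled.
note = UseCaseNote(
    name="Draft client reports",
    data_sources=["CRM", "project notes"],
    uses_personal_data=True,
    retention="6 years after engagement ends",
)
print(note.gaps())  # ['lawful basis', 'who can access it']
```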

The fourth pass is connecting to AI tools carefully. Enterprise plans for tools such as Microsoft Copilot for Microsoft 365 inherit your existing permissions structure and, per Microsoft’s documentation, do not train the underlying model on your tenant data. Consumer plans do not necessarily carry the same protections, so check the provider’s terms before relying on them. Your access controls determine what the AI can see; getting them right before you connect is the step that makes enterprise AI tools safe to use.
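Before connecting a tool that inherits permissions, it can help to scan an export of who can read what. This is a sketch under stated assumptions: the `permissions` mapping, the staff list, and the group names in `BROAD_GROUPS` are hypothetical, and real platforms expose this information in their own formats.

```python
# Hypothetical export: resource -> accounts or groups that can read it.
permissions = {
    "Client reports": {"Everyone"},
    "HR folder": {"sam", "priya"},
    "Proposals": {"priya", "jo.contractor"},
}
current_staff = {"sam", "priya", "alex"}
BROAD_GROUPS = {"Everyone", "All Staff"}  # assumed names for firm-wide groups


def audit(permissions, current_staff):
    """Return (resource, principal, reason) for each grant worth reviewing."""
    findings = []
    for resource, principals in sorted(permissions.items()):
        for principal in sorted(principals):
            if principal in BROAD_GROUPS:
                findings.append((resource, principal, "firm-wide access"))
            elif principal not in current_staff:
                findings.append((resource, principal, "not a current member of staff"))
    return findings


for finding in audit(permissions, current_staff):
    print(finding)
# ('Client reports', 'Everyone', 'firm-wide access')
# ('Proposals', 'jo.contractor', 'not a current member of staff')
```

Both findings mirror the opening scenario: content open to the whole firm, and an account that should have been removed when a contractor left.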

When is a lighter-touch approach good enough?

Regulators take a risk-based approach. The ICO does not require a DPIA for every AI experiment; the trigger is processing “likely to result in a high risk” to individuals, such as profiling, large-scale data use, or automated decisions with significant effects. Using AI to summarise internal notes containing no personal data sits well below that threshold. Applying full governance to such tasks would slow you down without making anyone safer.

The dividing line is usually clear in practice. Using AI to generate a first draft from your own notes? A brief check that no client names have slipped in is enough. Using AI to profile prospects from your CRM, match clients to services, or automate onboarding communications? That sits in different territory, and the ICO’s automated decision-making guidance applies.
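The dividing line can be expressed as a rough first-pass triage. The sketch below only encodes the triggers named above (profiling, large-scale use, automated decisions with significant effects); the `UseCase` fields and the wording of the outcomes are assumptions, and it does not replace the ICO's own screening guidance.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    name: str
    uses_personal_data: bool = False
    profiles_individuals: bool = False
    large_scale: bool = False
    automated_decisions: bool = False  # decisions with significant effects on people


def triage(use_case):
    """Suggest how much governance a use-case is likely to need."""
    if not use_case.uses_personal_data:
        return "light touch: check no personal data has slipped in"
    if (use_case.profiles_individuals
            or use_case.large_scale
            or use_case.automated_decisions):
        return "high risk indicators: carry out a DPIA first"
    return "standard: record the lawful basis and review access"


drafting = UseCase("Draft from own notes")
prospecting = UseCase("Profile CRM prospects", uses_personal_data=True,
                      profiles_individuals=True)
print(triage(drafting))     # light touch: check no personal data has slipped in
print(triage(prospecting))  # high risk indicators: carry out a DPIA first
```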

One counterpoint worth keeping in mind. Thorough data governance does not guarantee AI projects will deliver commercial value. The National Audit Office has repeatedly found that unclear business objectives and poor change management are equally common causes of digital and AI project failure. Getting your data right is a prerequisite. Solid data alongside unclear objectives will still produce a project that misses.

What else connects to data readiness?

Several concepts come up repeatedly once you start working on data readiness, and understanding them early prevents confusion. Role-based access control (RBAC) means assigning data access by job role rather than individual. Data minimisation means collecting and retaining only what you genuinely need, a key ICO expectation. The EU AI Act is adding compliance timelines that UK firms selling into Europe need to plan for.

RBAC matters beyond AI specifically. Your CRM, cloud storage, and project management tools should all have explicit role assignments so that new starters get access to what their role requires, and departing contractors lose it cleanly. AI amplifies the importance of this because AI tools synthesise information across everything they can reach. The narrower the access, the lower the risk.
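In code, role-based access reduces to two lookups: user to role, and role to permitted systems. The sketch below is illustrative only; the role names, system names, and the `Directory` class are hypothetical and not tied to any particular product.

```python
# Hypothetical roles and the systems each role may read.
ROLE_PERMISSIONS = {
    "partner": {"crm", "client_files", "finance"},
    "consultant": {"crm", "client_files"},
    "contractor": {"client_files"},
}


class Directory:
    """Tracks which role each user holds; access follows from the role."""

    def __init__(self):
        self._roles = {}  # username -> role name

    def assign(self, user, role):
        if role not in ROLE_PERMISSIONS:
            raise ValueError(f"unknown role: {role}")
        self._roles[user] = role

    def offboard(self, user):
        # Removing the role removes every permission in one step.
        self._roles.pop(user, None)

    def can_access(self, user, system):
        role = self._roles.get(user)
        return role is not None and system in ROLE_PERMISSIONS[role]


directory = Directory()
directory.assign("jo", "contractor")
print(directory.can_access("jo", "client_files"))  # True
print(directory.can_access("jo", "finance"))       # False
directory.offboard("jo")
print(directory.can_access("jo", "client_files"))  # False
```

Because permissions hang off the role rather than the person, offboarding is a single change instead of a hunt through individual grants.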

Data minimisation is a principle in UK GDPR rather than a preference. Collecting less personal data than you think you need is almost always the right call. It reduces your ICO exposure, simplifies the DPIA process if you go down that road, and means AI tools drawing on your data work from relevant information rather than generating inferences from data you never intended to use.
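The same principle can be applied at the point where data is handed to an AI tool, by passing only an approved set of fields. This is a minimal sketch: the field names in `ALLOWED_FIELDS` and the sample record are hypothetical.

```python
# Hypothetical allow-list: the only fields this use-case genuinely needs.
ALLOWED_FIELDS = {"client_ref", "sector", "project_summary"}


def minimise(record, allowed=ALLOWED_FIELDS):
    """Return a copy of the record containing only allow-listed fields."""
    return {key: value for key, value in record.items() if key in allowed}


record = {
    "client_ref": "C-1042",
    "sector": "Architecture",
    "project_summary": "Office refit feasibility study",
    "contact_name": "A. Example",
    "contact_mobile": "07000 000000",
}
print(minimise(record))
# {'client_ref': 'C-1042', 'sector': 'Architecture',
#  'project_summary': 'Office refit feasibility study'}
```

An allow-list is used rather than a block-list so that any field added to the source system later is excluded by default.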

The EU AI Act sets requirements for organisations operating within the EU or supplying AI systems into it. UK firms are not directly subject to it post-Brexit, but those working with EU clients or selling into European markets may face its data governance obligations for high-risk AI systems. The Act entered into force on 1 August 2024; most provisions apply 24 months after that date, with some high-risk requirements phased in over 36 months. If you are growing cross-border, building your data governance now is simpler than retrofitting it later.

A data-readiness exercise pays twice over. It makes your AI tools more reliable today, and it de-risks your business against a regulatory landscape that is still taking shape. The four steps, inventory, clean, document, and connect, can be completed in a few focused days for a typical services firm. Start with the data your first AI use-case will actually touch, get that in order, and expand from there.

