The vendor described it as safe. A compact AI model running on the firm’s own servers, processing invoices and drafting compliance summaries with no data leaving the building. The finance director liked the logic: private infrastructure, controlled environment, no cloud exposure. The compliance officer then asked the question nobody had answered. If this model produces output that touches a client account or a regulated process, who is accountable for that output?
That question is the one the FCA, ICO, and NCSC are all shaping their guidance around right now.
What is a compact model in a finance context?
A compact model (sometimes called a small language model) is an AI system with far fewer parameters than a frontier model like GPT-4. In practice, it can run on a firm’s own servers rather than sending data to an external provider. For finance teams, the appeal is keeping client data inside the firm’s own infrastructure while still getting AI help with document-heavy tasks.
Size is a cost and latency variable, not a governance one. Compact models built for domain-specific tasks can be highly capable at classification, extraction, and text generation within a narrow scope. Some are designed specifically for finance applications: expense categorisation, ledger reconciliation notes, policy Q&A over internal documents. What actually matters from a governance standpoint is what the model does with the data it sees, and who takes responsibility for the output.
That distinction matters because the UK’s regulatory framework for AI in financial services contains no size exemption.
Why does this matter for regulated firms?
The FCA, ICO, and NCSC have all issued AI guidance since 2023, and none of it scales down its expectations based on model size. The FCA’s discussion paper on AI in financial services focuses on governance, accountability, and consumer harm. The ICO’s 2024 generative AI guidance focuses on lawful basis, human oversight, and data protection impact assessments. Neither document makes an exception for smaller models.
The FCA has been direct on accountability: using a third-party model or outsourcing to a vendor does not transfer responsibility away from the authorised firm. The FCA’s Principles for Businesses apply regardless. The ICO adds that where AI processes personal data in high-risk ways, firms must complete a Data Protection Impact Assessment before deployment. In finance, that obligation covers AI used in credit processing, complaints handling, customer profiling, and fraud detection.
The NCSC’s guidance adds a further dimension. A model running on your own servers is not automatically safer than one accessed via an API. Prompt injection, weak access controls, and unlogged outputs are risks that persist wherever the model is hosted. Running a model internally is an operational choice; it is not a substitute for the security controls the NCSC expects. Treating private deployment as sufficient control is one of the clearest mistakes firms make when rolling out AI in regulated workflows.
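One of the risks named above, unlogged outputs, is straightforward to illustrate. The following is a minimal sketch in Python, not a statement of what the NCSC requires. The function names, the field names, and the choice to store SHA-256 hashes instead of raw text are all assumptions made for illustration; a firm may well need to retain the full prompt and output in a separate access-controlled store.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_audit_record(user_id: str, prompt: str, output: str, model_name: str) -> dict:
    """Build one audit entry for a single model call (hypothetical schema)."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "model_name": model_name,
        # Hashes let a reviewer confirm which text was produced
        # without copying client data into the log itself.
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "output_sha256": hashlib.sha256(output.encode("utf-8")).hexdigest(),
        # Stays empty until a named human reviewer signs off.
        "reviewed_by": None,
    }


def append_audit_record(path: str, record: dict) -> None:
    """Append the record as one JSON line to a local log file."""
    with open(path, "a", encoding="utf-8") as log_file:
        log_file.write(json.dumps(record) + "\n")
```

Calling build_audit_record on every model call and appending the result gives a reviewer a record of who asked what, and when, even when the model never leaves the building.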
Where will you actually meet compact models in finance work?
The use cases with the best track record are assistive, not decisive. Examples include invoice and expense classification, policy and procedure lookup, first-draft compliance summaries, meeting notes with human review, and internal Q&A over a restricted knowledge base. These work because the model helps without making the final call. That fits what the FCA, ICO, and NCSC all emphasise: accountability, transparency, and human oversight of outputs.
The Bank of England has long recognised that financial reporting is burdensome for smaller firms. A compact model that extracts, classifies, and routes routine information from documents can reduce that load, provided the firm validates outputs and maintains an audit trail. UK Finance and the government’s 2025 Financial Services Growth and Competitiveness Strategy both point towards AI-enabled productivity as a priority for the sector. The direction is towards safe experimentation, and safe experimentation requires governance infrastructure to match.
The practical architecture that works well here is retrieval-augmented generation (RAG). The model does not answer from its general training data; it retrieves from a set of firm-approved documents, then generates a response grounded in those sources. For a compliance team, that means traceable, reviewable outputs rather than plausible-sounding answers with no basis in the firm’s actual policies. This pattern, constrained retrieval plus human review, is more defensible under UK governance expectations than open-ended chat over live client records.
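The retrieve-then-generate pattern described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production design: retrieval here is naive keyword overlap where a real system would typically use embeddings, the Document type and all function names are hypothetical, and generate stands in for whatever model call the firm actually uses.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase words with surrounding punctuation stripped."""
    return {word.strip(".,;:?!()").lower() for word in text.split()} - {""}


def retrieve(query: str, approved_docs: list[Document], top_k: int = 2) -> list[Document]:
    """Rank firm-approved documents by how many query terms they share."""
    query_terms = tokenize(query)
    scored = [(len(query_terms & tokenize(doc.text)), doc) for doc in approved_docs]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:top_k] if score > 0]


def build_prompt(query: str, sources: list[Document]) -> str:
    """Ground the model in the retrieved sources and ask it to cite them."""
    context = "\n".join(f"[{doc.doc_id}] {doc.text}" for doc in sources)
    return (
        "Answer using only the sources below and cite their IDs.\n"
        f"Sources:\n{context}\n"
        f"Question: {query}"
    )


def answer(query: str, approved_docs: list[Document], generate: Callable[[str], str]) -> dict:
    """Return the generated text plus the document IDs it was grounded in."""
    sources = retrieve(query, approved_docs)
    if not sources:
        # Decline instead of letting the model answer from general training data.
        return {"answer": None, "sources": []}
    return {
        "answer": generate(build_prompt(query, sources)),
        "sources": [doc.doc_id for doc in sources],
    }
```

Returning the source IDs alongside the answer is what makes the output reviewable: a human checker can open the cited policy and confirm the response matches it.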
When does a compact model make sense, and when should you step back?
The practical divide is between assistive AI and decisive AI. Assistive use cases include drafting, summarising, classifying, and searching. Decisive use cases include creditworthiness assessments, suitability determinations, claims denials, and suspicious activity report triage. The FCA and ICO apply much stronger governance expectations to the second group, and a compact model cannot substitute for the human sign-off those decisions require.
The commercial case is strongest where AI removes repetitive manual steps, not where it replaces professional judgement. For many SME services firms, the likely gains are minutes saved per document, faster turnaround on routine queries, and better consistency on first drafts. The risks arrive when change control is weak, when staff use unsanctioned tools on regulated work, or when nobody can identify who reviews and approves outputs before they affect a client.
The practical move before deploying any compact model in a regulated workflow is to map each use case onto one of two columns: assistive (model helps, human decides) or decisive (model output triggers or informs a consequential action). Any use case in the second column needs a documented review process, a named reviewer, and a clear policy on how model-generated output is recorded and challenged. That mapping exercise costs little more than staff time and resolves many of the governance questions before deployment begins.
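The two-column mapping can live in a spreadsheet, but a small Python sketch shows the logic. The class and field names below are hypothetical, and the rule that assistive use cases need no extra controls is a simplification for illustration; a firm may reasonably want review steps in both columns.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Role(Enum):
    ASSISTIVE = "assistive"  # model helps, human decides
    DECISIVE = "decisive"    # output triggers or informs a consequential action


@dataclass
class UseCase:
    name: str
    role: Role
    named_reviewer: Optional[str] = None
    review_process_doc: Optional[str] = None
    recording_policy_doc: Optional[str] = None


def missing_controls(use_case: UseCase) -> list[str]:
    """List the controls a decisive use case still lacks before deployment."""
    if use_case.role is Role.ASSISTIVE:
        return []
    required = {
        "named reviewer": use_case.named_reviewer,
        "documented review process": use_case.review_process_doc,
        "recording and challenge policy": use_case.recording_policy_doc,
    }
    return [label for label, value in required.items() if not value]
```

Running missing_controls over every planned use case gives a simple gap list: anything it returns for a decisive use case is a control to put in place before go-live.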
What related concepts should you understand before deploying?
Three concepts come up regularly when finance teams and regulated firms start deploying compact models. Retrieval-augmented generation (RAG) is the architecture that makes constrained, citable AI practical in finance work. Data protection impact assessments (DPIAs) are what the ICO expects before AI deployment in high-risk contexts. The EU AI Act can also create binding obligations for UK firms with EU clients or group exposure.
RAG matters because it constrains where the model looks for answers. It retrieves from a curated document set you control, then generates a response grounded in those documents. For a compliance team, that means traceable outputs rather than hallucinated versions of internal policies.
DPIAs are increasingly relevant as finance AI use scales. The ICO’s guidance is clear: where AI is used in high-risk contexts including credit, insurance, or employment-related decisions, a DPIA is expected before deployment begins.
The EU AI Act is worth reviewing even for UK-only operations. If the firm serves EU-based clients, processes data on behalf of EU entities, or operates within a group with EU exposure, the Act’s high-risk AI rules may apply. Those rules include mandatory conformity assessments and registration requirements for certain finance-related applications, and they are stricter than the UK’s current framework. A UK firm that assumes its domestic compliance position covers EU exposure is taking on risk it may not have mapped.
If you want to talk through where your firm sits on the assistive-versus-decisive line, book a conversation.



