Bot vs local capture: why client calls aren't a place for Otter

TL;DR

Bot-based meeting tools (Fireflies, Otter, Fathom, Avoma) are excellent for internal meetings and consistent for CRM-integrated sales calls, but create real client discomfort in professional services calls where the client expects a private, focused conversation. Local-capture tools (Granola, Jamie) record from the host's device with no visible bot, which keeps the meeting feel intact at the cost of single-source audio. The right choice depends on the meeting type, not on the feature comparison.

Key takeaways

- Bots that join visibly create client discomfort in professional services calls. Local-capture tools (Granola, Jamie) record from the host's device with no visible bot, preserving the meeting feel.
- Time savings are real and consistent across both categories: 4 to 6 hours of manual transcription drops to 5 to 15 minutes of review, plus 2 to 4 hours saved per meeting on action capture and summary.
- Accuracy ceiling is 80 to 95 percent on English in good audio. AI does not capture intent, judgement, or what the client actually needs. The post-meeting interpretation step still requires the practitioner.
- SME tool sweet spot: Granola or Jamie at £15 to £30 per user per month for external client calls; native Teams, Zoom, or Google Meet recording for internal. Total cost for a 5 to 15 person firm is typically £100 to £300 a month.
- Compliance: the SRA permits one-party-consent recording, but explicit client consent is best practice. NHS Digital and GDPR require explicit patient consent. FCA-regulated calls often require recording. AI processing of recorded calls requires a DPA.
- Implementation pattern: deploy local capture for all calls, ask explicit client consent on external calls, give the option to decline, and build a unified meeting-summary archive over six months.

A 20-person consulting firm deployed Fireflies across all calls last quarter. Within four weeks, two clients had explicitly asked not to be recorded. One renegotiated the engagement scope to exclude AI tooling. Internally, the team reported a subtle change: staff felt they were performing on calls, choosing words more carefully, and conversations felt flatter. The firm switched external calls to Granola for local capture and kept Fireflies for internal meetings. Both are quiet wins now that the categories are properly separated.

This is the meeting AI mistake that does not show up in vendor demos. The time savings are real. The category most owners reach for first (bot-based, easy setup, deep CRM integration) is the wrong choice for the calls that matter most. The fix is a different tool for a different meeting type, not a better tool overall.

What is the difference between bot and local capture?

Bot-based meeting tools (Fireflies, Otter, Fathom, Avoma) join the video call as a visible participant. The bot appears in the participant list, often introduces itself, and records the conversation from the platform's perspective. Local-capture tools (Granola, Jamie) record audio from the host's device without joining the call. The other participants see only the people they are talking to. The host knows the recording is happening. Nobody else does, unless told.

The category split does not show up in the time-saving numbers. Both deliver similar results: 4 to 6 hours of manual transcription drops to 5 to 15 minutes of review for a one-hour meeting; 15 to 30 minutes of action-item extraction drops to 2 to 5 minutes; net 2 to 4 hours saved per meeting in post-meeting overhead. The technology is comparable. The deployment friction is not.

For a consulting firm running 10 to 15 client meetings a week, that operational saving is large enough to fund the tool many times over. The category choice is what determines whether the saving lasts beyond the first month.

Why does the bot create client discomfort?

Clients notice when a bot joins. Most professional services clients expect the call to be a private, focused conversation. The visible bot signals "this conversation is being recorded by a third-party platform" in a way that human note-taking does not. Some clients explicitly ask not to be recorded. Some say nothing and just become more guarded. Either response defeats the purpose of the tool: a guarded conversation is worse than an unrecorded one.

The reason the discomfort is rarely articulated is that clients often cannot name what feels off. They sense an additional presence and adjust accordingly. The advisor experiences a flatter conversation and a less productive call without realising the bot is the cause.

Local-capture tools sidestep the issue. The advisor's laptop records audio from the device's microphone, the platform sees only humans, and the client experiences the same call they would have had without AI. The recording quality drops slightly because only one device is capturing audio (the host's), but the conversation quality holds.

When does the bot category actually fit?

Internal meetings are the natural home for bot-based capture: team stand-ups, internal project reviews, planning sessions. There is no client to unsettle. Audio quality is uniform because everyone is on the same platform with similar setups. Action items, decisions, and follow-ups need to be tracked across multiple participants, which is exactly what bots are good at.

Sales calls integrated with a CRM are another fit. Tools like Fathom and Avoma push transcripts and action items directly into Salesforce or HubSpot deal records, which speeds up pipeline updates and means the sales lead does not have to retype meeting notes. The trade-off (visible bot in the call) is acceptable in many sales contexts because clients expect some record of the conversation in a CRM workflow.

Client-facing professional services calls are where the bot category fails. Confidential advisory work, sensitive engagement discussions, anything where the client is processing emotionally or making a strategic call. Local capture or no capture is the right answer in those contexts.

What is the realistic cost and saving?

For a 5 to 15 person services firm, the most economical setup is Granola or Jamie at £15 to £30 per user per month for external calls, plus native recording in Teams, Zoom, or Google Meet for internal meetings (usually included in the existing productivity stack). Total tool spend is typically £100 to £300 a month for the firm.

The time savings are large enough that ROI lands in weeks. A 15-person consulting firm running 10 to 15 client meetings per week recovers 17 to 28 hours a week of post-meeting admin. At loaded consulting staff rates of £40 an hour, that is £36,000 to £59,000 a year. Tool cost £2,400 to £4,800 a year. Net annual benefit £31,000 to £55,000. Payback in 2 to 4 weeks.

For a 10-person legal practice running 5 to 8 client meetings a week, the numbers scale down but the ratio holds. 10 to 24 hours a week recovered, £20,000 to £50,000 annual benefit, £1,200 to £3,600 annual tool cost, payback in 1 to 3 weeks.
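As a sanity check on the figures above, the payback arithmetic fits in a few lines. This is an illustrative sketch, not a vendor calculator; the function name and inputs are hypothetical, seeded with the article's assumed rates (hours recovered per week, a £40 loaded hourly rate, monthly tool spend).

```python
# Back-of-envelope ROI model for meeting-capture tooling.
# All inputs are assumptions drawn from the article's worked examples.

def meeting_ai_roi(hours_recovered_per_week: float,
                   loaded_rate_gbp: float,
                   tool_cost_gbp_per_month: float,
                   weeks_per_year: int = 52) -> dict:
    """Return annual benefit, annual tool cost, net benefit, and payback in weeks."""
    weekly_saving = hours_recovered_per_week * loaded_rate_gbp
    annual_benefit = weekly_saving * weeks_per_year
    annual_tool_cost = tool_cost_gbp_per_month * 12
    return {
        "annual_benefit": annual_benefit,
        "annual_tool_cost": annual_tool_cost,
        "net_benefit": annual_benefit - annual_tool_cost,
        "payback_weeks": annual_tool_cost / weekly_saving,
    }

# Consulting-firm example, low end of the ranges: 17 h/week at £40/h, £200/month tooling.
low = meeting_ai_roi(17, 40, 200)
# annual_benefit £35,360 (rounded to £36,000 in the text), payback roughly 3.5 weeks
```

Running the high end of the ranges (28 hours, £400 a month) lands inside the same 2 to 4 week payback window, which is why the ratio holds across firm sizes.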

What does AI not do in meeting capture?

AI does not capture intent, judgement, or what the client actually needs. The transcript reflects what was said. The summary reflects what the AI extracted. Neither captures the moment when the client looked away before answering, or the question they did not ask, or the concern that came through in tone rather than words. Those signals still depend on the practitioner being present and noticing.

The post-meeting interpretation step is still real work. Practitioners spend 15 to 30 minutes reviewing the AI-generated summary, correcting misattributions, adding context the AI missed, and turning raw action items into something a client would recognise as the agreed plan. The AI compresses the mechanical layer. It does not replace the interpretation layer.

This is the framing that protects against disappointment. Owners who expect AI to remove the entire post-meeting overhead are surprised when the saved time is 80 percent rather than 100 percent. Owners who expect AI to remove the mechanical layer and free time for interpretation get exactly what they were promised.

Where do compliance gates land?

For legal practices, recording client calls without explicit consent may breach professional standards even where it is permissible under one-party-consent rules. The SRA does not prohibit recording, but firms must ensure recordings are handled securely and confidentiality is maintained. Explicit client consent is best practice, written into the engagement letter or asked at the start of the call.

For healthcare clinics, recording patient calls is regulated. NHS Digital governance plus UK GDPR require explicit patient consent before recording, a clear purpose for the recording, and a defined retention period (typically 6 to 12 months unless there is a clinical or legal reason to retain longer).

For financial services firms regulated by the FCA, customer call recording is often required for compliance, and many firms record without explicit notification under their existing regulatory framework. AI processing of recorded calls requires a Data Processing Agreement with the AI vendor.

The protocol is consistent across sectors: get explicit consent, deploy a tool with a DPA, review the data retention default, and document the decision. Local-capture tools and bot-based tools both meet these requirements when deployed properly.

If you are choosing between bot and local capture for the meetings that matter most in your firm, the choice is rarely about features. It is about which conversation you are protecting and which one you are documenting. Book a conversation.

Sources

  • Zackproser, Best AI Meeting Notes 2026.
  • YouTube case study: meeting-notes deployment in services firms.
  • Tess Group, AI Compliance for UK Businesses, 2026 guide.
  • Law Society UK, compliance and use of AI in law firms.
  • Brynjolfsson, E., Li, D. and Raymond, L. (2023). Generative AI at Work. NBER Working Paper 31161. Empirical productivity study showing a 14 per cent average gain, with 34 per cent for low-skilled workers; the basis for sector-specific AI productivity claims.
  • McKinsey & Company (2024). From Promise to Impact: How Companies Can Measure and Realise the Full Value of AI. Five-layer measurement framework for evaluating sector AI deployments.
  • Goldman Sachs (2023). Generative AI Could Raise Global GDP by 7 Per Cent. Cross-sector productivity-paradox research; the macroeconomic context for sector-level AI ROI claims.
  • Boston Consulting Group (2026). When Using AI Leads to Brain Fry. Study of 1,488 US workers across large companies on AI oversight load, error rates, decision overload, and intent to quit.

Frequently asked questions

Why do bots in client calls cause problems?

Clients often notice the bot joining and many are uncomfortable with it. Some explicitly ask not to record, some renegotiate scope to exclude AI tooling, some go quieter and the call's value drops. The privacy concern is legitimate even where the recording is technically permissible. Local-capture tools sidestep the issue by recording from the host's device without any visible joiner.

What time savings are realistic for a 15-person consulting firm?

Post-meeting time drops from 1.5 to 3 hours per meeting to 15 to 20 minutes (review and finalise the AI summary, add missing context). For a firm conducting 10 to 15 client meetings a week, that is 17 to 28 hours a week recovered. At loaded staff rates around £40 an hour, the annual benefit lands at £36,000 to £59,000. Tool costs £200 to £400 a month. Payback in 2 to 4 weeks.

Which tool fits a small services firm?

Granola or Jamie for external client calls (local capture, no bot visibility, £15 to £30 per user per month) paired with native recording in Teams, Zoom, or Google Meet for internal meetings. For a 5 to 15 person firm running 10 plus client meetings a week, total tool spend is typically £100 to £300 a month.

What does AI not do in meeting capture?

AI does not capture intent, judgement, or what the client actually needs. The transcript reflects what was said, not what was meant. Practitioners still listen actively, interpret tone, and add context after the meeting. AI saves time on mechanical transcription and basic action extraction. The meeting itself still requires presence.

This post is general information and education only, not legal, regulatory, financial, or other professional advice. Regulations evolve, fee benchmarks shift, and every situation is different, so please take qualified professional advice before acting on anything you read here. See the Terms of Use for the full position.

Ready to talk it through?

Book a free 30 minute conversation. No pitch, no pressure, just a useful chat about where AI fits in your business.

Book a conversation
