How to Build a GDPR-Compliant AI Chatbot (Without Killing the User Experience)
There's a persistent myth that GDPR and good chatbot UX are opposites. That compliance means clunky, annoying, unusable experiences. Pop-ups everywhere. Disclaimers longer than the conversation itself.
It's not true. You can build an AI chatbot that's genuinely helpful, conversational, and personalised — while being fully GDPR compliant. The trick is getting the architecture right from the start, not bolting compliance on after the fact.
I've built compliant chatbot systems for businesses across the UK and EU. Here's exactly how it works.
The Real Tension (And Why It's Solvable)
GDPR cares about a few specific things: minimise the data you collect, use it only for the stated purpose, store it securely, delete it when you're done, and let people access or erase their data on request.
A good chatbot cares about: understanding the user's question, having enough context to give a useful answer, and remembering what was said earlier in the conversation.
These aren't opposites. They're design constraints. And design constraints produce better products, not worse ones.
The businesses that build terrible, compliance-heavy chatbot experiences are the ones that panic-added GDPR features after someone in legal got nervous. If you architect for compliance from day one, the user never notices.
The Architecture That Works
Here's the structure you need. Every GDPR-compliant chatbot follows this pattern:
User (browser) → Your Server → LLM API → Your Server → User
Not this:
User (browser) → LLM API directly
The difference matters enormously. When your server sits between the user and the LLM, you control the data. You decide what gets sent to the model, what gets stored, and what gets deleted. Without that middleware layer, you've handed control to whoever runs the LLM.
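To make the pattern concrete, here's a minimal sketch of the server-in-the-middle flow. Everything here is illustrative: `call_llm` is a stub standing in for whichever provider API you use, and `Session` stands in for conversation state held in your own database.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    # Conversation state lives in YOUR database, under YOUR retention rules.
    history: list = field(default_factory=list)


def call_llm(prompt: str) -> str:
    # Placeholder for the real API call (OpenAI, Anthropic, self-hosted).
    return f"[reply to: {prompt}]"


def handle_chat(session: Session, user_message: str) -> str:
    # 1. The browser talks only to this endpoint (your server),
    #    never to the LLM API directly.
    # 2. You decide what context gets forwarded; here, the last 5 turns.
    context = session.history[-5:]
    prompt = "\n".join(context + [user_message])
    reply = call_llm(prompt)
    # 3. You store the exchange yourself, so deletion and retention
    #    are under your control.
    session.history += [user_message, reply]
    return reply
```

The point of the sketch is the shape, not the stub: the user's message enters your code, you choose what reaches the model, and you keep the record.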
The Four Layers
1. Frontend (User Interface) Your chatbot widget or page. This handles:
- Displaying the AI disclosure ("You're chatting with an AI assistant")
- Linking to your privacy notice
- Consent collection where required
- The "talk to a human" escape hatch
2. Backend (Your Server) This is the compliance engine. It handles:
- Receiving user messages
- Stripping unnecessary personal data before forwarding to the LLM
- Managing conversation history and context windows
- Enforcing retention policies (auto-deleting old conversations)
- Handling data subject access requests
- Audit logging
3. LLM Layer (The AI Provider's API) The model that generates responses. Your server sends it a sanitised prompt; it sends back text. That's it. The LLM provider should:
- Have a signed Data Processing Agreement (DPA) with you
- Be configured for zero data retention
- Not use your conversations for model training
4. Integration Layer Connections to your CRM, ticketing system, knowledge base, or order management. Each integration is a data flow that needs documenting in your DPIA.
The key principle: data flows through your server, not around it. You're the data controller. Act like one.
Choosing Your LLM Provider (The GDPR Way)
Not all LLM providers are equal from a data protection standpoint. Here's how the main options compare in 2026:
OpenAI API (GPT-4o, GPT-4.5)
- DPA: Available, sign it before you write a line of code
- Data retention: Zero-retention available via API settings. Opt in explicitly — it's not the default
- Training: API data not used for training when zero-retention is enabled
- Data residency: EU hosting available through Azure OpenAI Service
- Sub-processors: Published list, regularly updated
- Verdict: Solid choice if you configure it properly. The Azure route gives you EU data residency, which simplifies your DPIA
Anthropic Claude API
- DPA: Available
- Data retention: Doesn't train on API data by default — this is a meaningful distinction
- Training: API conversations excluded from training unless you opt in
- Data residency: US-based processing (EU options expanding)
- Sub-processors: Published list
- Verdict: The default no-training policy is a strong starting point. Good for businesses that want fewer configuration steps
Self-Hosted (Llama 3, Mistral, Qwen)
- DPA: Not applicable — you host it yourself
- Data retention: Entirely under your control
- Training: Your data never leaves your infrastructure
- Data residency: Wherever you host it
- Verdict: Maximum data control. But you're paying £200-£500/month for GPU hosting, and the model quality is a step behind the leaders for complex conversations. Good for regulated industries where data can't leave your environment
What to Check Before You Sign
For any provider, verify these four things:
- DPA available and signed — not "available on request," actually signed
- Data residency options — where does data go during processing?
- Training data policy — are your conversations feeding the next model version?
- Sub-processor list — who else touches your data?
If the provider can't give you clear answers to all four, pick a different provider.
The 7 GDPR Requirements (And How to Actually Implement Them)
GDPR is principles-based, which means there's no checklist you can tick and call it done. But there are seven requirements that apply directly to chatbots. Here's what each one means in practice.
1. Lawful Basis — Why Are You Processing This Data?
You need a legal reason to process the personal data your chatbot collects. For most chatbots:
- Customer support chatbot: Legitimate interest. You have a genuine business need to answer customer queries, and customers reasonably expect it. Document this in a Legitimate Interest Assessment (LIA).
- Sales/marketing chatbot: You'll likely need consent. If the chatbot is proactively engaging visitors or collecting lead information, get explicit consent first.
- Internal employee chatbot: Legitimate interest or contract performance, depending on what it does.
Don't default to consent for everything. Consent can be withdrawn at any time, which means your chatbot could lose access to the data it needs mid-conversation. Legitimate interest is more stable — when it genuinely applies.
2. Transparency — Tell People What's Happening
Users need to know three things:
- They're talking to an AI. The EU AI Act makes this a legal requirement from August 2025. Even if you're UK-only, it's good practice and the ICO expects it.
- What data you're collecting. Link to your privacy notice from the chatbot interface.
- What you're doing with it. Summarise: "We process your messages to answer your query. Conversations are deleted after 90 days."
This doesn't need to be intrusive. A single line above the chat input — "You're chatting with an AI assistant. [Privacy notice]" — covers it.
3. Data Minimisation — Don't Collect What You Don't Need
This is where architecture matters most. Your backend should:
- Strip personal data from LLM prompts where possible. If the user gives their order number, look up the order in your system and send the order details to the LLM — not the user's full name and address.
- Limit context window size. Don't send the entire conversation history with every message. Send the last 5-10 messages. The LLM doesn't need everything.
- Don't ask for information you don't need. If the chatbot handles FAQs, it doesn't need the user's email address.
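A minimisation layer can start as simply as this. The regex patterns below are illustrative only; a production system would typically use a proper PII-detection library or NER model rather than regexes alone.

```python
import re

# Illustrative patterns, not an exhaustive PII detector.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
UK_PHONE = re.compile(r"(?:\+44\s?|0)\d{9,10}\b")


def minimise(message: str) -> str:
    """Redact personal data the LLM doesn't need to answer the query."""
    message = EMAIL.sub("[email]", message)
    message = UK_PHONE.sub("[phone]", message)
    return message


def build_prompt(history: list[str], message: str, max_turns: int = 10) -> str:
    """Forward only the recent, minimised context, not the whole conversation."""
    recent = history[-max_turns:]
    return "\n".join(minimise(m) for m in recent + [message])
```

Running every outbound prompt through a function like `build_prompt` gives you one place to enforce both minimisation rules at once.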
4. Purpose Limitation — Support Data Stays Support Data
If someone contacts your chatbot about a delivery issue, that conversation is for resolving the delivery issue. You can't later feed it into a marketing segmentation model or use it to train a custom model without separate consent.
Implementation: tag conversations with their purpose in your database. Enforce access controls so the marketing team can't query the support conversation logs.
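One way to sketch that tagging and access control, assuming a simple SQLite store (the schema, role names, and helper functions here are hypothetical):

```python
import sqlite3
from datetime import datetime, timezone


def init_db(conn):
    conn.execute("""CREATE TABLE IF NOT EXISTS conversations (
        id INTEGER PRIMARY KEY,
        user_id TEXT NOT NULL,
        purpose TEXT NOT NULL,      -- e.g. 'support', 'sales'
        transcript TEXT NOT NULL,
        created_at TEXT NOT NULL)""")


def save_conversation(conn, user_id, purpose, transcript):
    conn.execute(
        "INSERT INTO conversations (user_id, purpose, transcript, created_at)"
        " VALUES (?, ?, ?, ?)",
        (user_id, purpose, transcript,
         datetime.now(timezone.utc).isoformat()))


def query_conversations(conn, purpose, caller_role):
    # Purpose limitation enforced in code: marketing can't read support logs.
    allowed = {"support_agent": {"support"}, "sales_agent": {"sales"}}
    if purpose not in allowed.get(caller_role, set()):
        raise PermissionError(f"{caller_role} may not access {purpose} data")
    return conn.execute(
        "SELECT transcript FROM conversations WHERE purpose = ?",
        (purpose,)).fetchall()
```

The same `purpose` column also drives retention (different purposes, different deletion schedules), which is why it's worth adding on day one.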
5. Storage Limitation — Delete What You Don't Need
Set retention periods and enforce them automatically:
| Chatbot Type | Suggested Retention | Rationale |
|---|---|---|
| Customer support | 30-90 days | Covers follow-ups and complaints |
| Sales enquiries | Until resolved + 30 days | Need context if they come back |
| General FAQ | 7-14 days | Low-value data, delete quickly |
| Complaints/disputes | 6-12 months | May need records for resolution |
Build automatic deletion into your system. A cron job that purges expired conversations every night. Don't rely on someone remembering to do it manually.
And critically: configure your LLM provider for zero retention. You want conversations stored in YOUR database with YOUR retention rules — not sitting on OpenAI's servers indefinitely.
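The nightly purge can be a few lines. This sketch assumes a `conversations` table with `purpose` and `created_at` columns and uses the retention periods from the table above; adapt the names to your own schema.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Retention periods per conversation type, per the table above.
RETENTION_DAYS = {"support": 90, "sales": 30, "faq": 14, "complaint": 365}


def purge_expired(conn, now=None):
    """Run nightly (cron or scheduled task): delete conversations past retention."""
    now = now or datetime.now(timezone.utc)
    deleted = 0
    for purpose, days in RETENTION_DAYS.items():
        cutoff = (now - timedelta(days=days)).isoformat()
        cur = conn.execute(
            "DELETE FROM conversations WHERE purpose = ? AND created_at < ?",
            (purpose, cutoff))
        deleted += cur.rowcount
    return deleted
```

Log the deletion counts each night: that log is your evidence, if the ICO ever asks, that the retention policy is actually enforced.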
6. Security — Protect the Data You Hold
Non-negotiable technical requirements:
- Encryption in transit: TLS 1.2+ for all connections (your server to the LLM, the user to your server)
- Encryption at rest: Encrypt conversation logs in your database
- Access controls: Only the systems and people that need chatbot data can access it
- Audit logging: Record who accessed what and when
- API key management: LLM API keys stored securely (environment variables, not hardcoded)
- Input validation: Sanitise user input to mitigate prompt injection attacks
Most of this is standard web application security. The chatbot-specific addition is prompt injection protection — making sure users can't trick your chatbot into revealing system prompts, other users' data, or behaving in unintended ways.
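Input-side checks are only a first line of defence. There is no regex that fully stops prompt injection; the structural protections (separate system and user roles, least-privilege integrations, never putting other users' data in the prompt) matter more. But a basic filter like this sketch, with illustrative patterns, still catches the low-effort attempts:

```python
import re

MAX_LEN = 2000

# Non-exhaustive patterns for obvious injection attempts.
SUSPICIOUS = re.compile(
    r"ignore (all )?(previous|prior) instructions|reveal your system prompt",
    re.IGNORECASE)


def validate_input(message: str) -> str:
    # Strip control characters, enforce a length cap, reject obvious probes.
    message = "".join(ch for ch in message if ch.isprintable() or ch in "\n\t")
    if len(message) > MAX_LEN:
        raise ValueError("message too long")
    if SUSPICIOUS.search(message):
        raise ValueError("message rejected by input filter")
    return message
```

Treat a rejection here as a signal worth logging, not just an error to swallow: repeated hits from one session are an audit-log event.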
7. Data Subject Rights — Handle the Requests
People have the right to:
- Access their chatbot conversations (Subject Access Request)
- Delete their conversations (Right to Erasure)
- Port their data (Right to Data Portability)
- Object to processing (Right to Object)
You need a process for each. Practically, this means:
- Store conversations linked to a user identifier (email, session ID)
- Build an export function that pulls all conversations for a given user
- Build a deletion function that wipes all conversations for a given user
- Respond within one month (the legal deadline under UK GDPR)
If you're using your LLM provider in zero-retention mode, deletion is simpler — you only need to delete from your own database.
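If conversations are stored against a user identifier as above, the export and deletion functions are short. This sketch assumes a `conversations` table keyed by `user_id`; the column and function names are illustrative.

```python
import json
import sqlite3


def export_user_data(conn, user_id: str) -> str:
    """Subject Access Request / portability: machine-readable export."""
    rows = conn.execute(
        "SELECT purpose, transcript, created_at FROM conversations"
        " WHERE user_id = ?", (user_id,)).fetchall()
    return json.dumps([
        {"purpose": p, "transcript": t, "created_at": c}
        for p, t, c in rows])


def erase_user_data(conn, user_id: str) -> int:
    """Right to Erasure: wipe everything held for this user."""
    cur = conn.execute(
        "DELETE FROM conversations WHERE user_id = ?", (user_id,))
    conn.commit()
    return cur.rowcount
```

The hard part isn't the code, it's building it before the first request arrives. A deletion request against a system with no `user_id` key means a manual trawl through logs.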
Consent and Transparency Done Right
Here's what good chatbot consent looks like. Not a wall of legal text. Not a dark pattern. Just clear information.
Above the chat input:
"Hi! I'm [Company]'s AI assistant. I can help with orders, returns, and general questions. [Privacy notice] | [Talk to a human]"
Before collecting personal details:
"I'll need your order number to look that up. We'll use it only to find your order and delete this conversation after 90 days."
Cookie/tracking consent: If your chatbot widget sets cookies or uses analytics, handle that through your existing cookie consent mechanism. Don't add a second consent layer.
EU AI Act disclosure: From August 2025, you must tell users they're interacting with an AI system. The line "I'm [Company]'s AI assistant" covers this. Simple.
The pattern: be honest, be brief, be accessible. People don't read long notices. They do read one-line disclosures.
What It Costs: Compliant vs. Non-Compliant
Let's talk money, because that's usually the real question.
Building GDPR-compliant from the start:
| Item | Cost |
|---|---|
| Chatbot development (mid-complexity) | £5,000-£8,000 |
| GDPR compliance layer (DPA, privacy notice, consent, retention, SAR handling) | £1,000-£2,000 |
| DPIA | £1,000-£2,000 |
| Total | £7,000-£12,000 |
Retrofitting compliance after a complaint:
| Item | Cost |
|---|---|
| Emergency GDPR audit | £2,000-£5,000 |
| Re-architecture (adding server middleware, retention, deletion) | £3,000-£8,000 |
| DPIA (rushed) | £1,500-£3,000 |
| Legal advice (because someone complained) | £2,000-£5,000 |
| Total | £8,500-£21,000 |
After an ICO enforcement action:
| Item | Cost |
|---|---|
| Everything above | £8,500-£21,000 |
| ICO fine (for SMEs, typically) | £5,000-£500,000 |
| Reputation damage | Incalculable |
The compliance layer adds roughly 15-20% to the build cost. Retrofitting costs 2-3x more. An enforcement action costs 10-50x more. The maths is straightforward.
The 7 Mistakes That Get Businesses in Trouble
I see these constantly. Every single one is avoidable.
1. Using consumer ChatGPT or Claude for business. The consumer products (chat.openai.com, claude.ai) are not the same as the APIs. Consumer conversations may be used for training. The API, configured correctly, is GDPR-compatible. The consumer product used for processing customer data is not.
2. No DPA with the LLM provider. Signing the DPA takes ten minutes. Not signing it makes every conversation a compliance violation. There's no excuse for this one.
3. Storing conversations forever. "We might need them later" is not a retention policy. Set a period. Enforce it. Delete on schedule.
4. No way to delete a user's conversation data. If someone sends a Subject Access Request or deletion request, you need to action it within one month. If your system wasn't built to find and delete a specific user's data, you're going to have a very stressful month.
5. No privacy notice mentioning AI processing. Your privacy notice needs to explain that you use AI to process customer queries, what data is involved, and who the sub-processors are (your LLM provider). Most businesses forget to update their privacy notice when they add a chatbot.
6. No "talk to a human" option. GDPR gives people the right not to be subject to solely automated decision-making. If your chatbot handles complaints, billing disputes, or anything with real consequences, users must be able to reach a human. A simple "Transfer to agent" button covers this.
7. Sending raw user data directly to the LLM. Your server should sanitise what gets sent to the model. If the user mentions their health condition while asking about a refund, the LLM doesn't need to know the health details. Strip what's unnecessary. It's better for compliance and it actually improves response quality.
Getting Started
You don't need to solve everything at once. Here's the order that makes sense:
- Choose your LLM provider and sign the DPA. Do this before writing any code.
- Design the architecture. Server in the middle. Always.
- Build the chatbot functionality. Get it working first.
- Add the compliance layer. Consent, transparency, retention, deletion, SAR handling.
- Complete the DPIA. Document what you've built and why it's proportionate.
- Update your privacy notice. Add the AI processing details.
- Test the data subject rights flows. Can you find, export, and delete a user's data? Actually test it.
If you want help building a chatbot that's compliant from the start — rather than spending twice as much fixing it later — we do this. We build the AI system and handle the compliance documentation as one package. It's what we do.
Related Reading
- How Much Does an AI Chatbot Cost? Real Pricing Breakdown for 2026 — detailed cost guide covering every chatbot type
- Do I Need a DPIA for My AI System? — step-by-step DPIA guidance for AI projects
- How to Automate Customer Support With AI — the full guide to AI-powered customer service
Need a GDPR-compliant chatbot built properly? We build AI systems with compliance documentation included — DPIAs, privacy notices, DPAs, the lot. See our services or get in touch.
Frequently Asked Questions
Does my AI chatbot need to be GDPR compliant?
If your chatbot processes personal data of people in the UK or EU — and it almost certainly does (names, email addresses, conversation content, IP addresses, device data) — then yes. GDPR applies regardless of where your business is based. The ICO has specifically identified AI chatbots as an area of enforcement focus.
Can I use ChatGPT or Claude API for a GDPR-compliant chatbot?
Yes, but with safeguards. Both OpenAI and Anthropic offer Data Processing Agreements, EU data residency options, and zero-data-retention API configurations. You need to: sign their DPA, enable zero-retention mode so conversations aren't used for training, configure EU data residency where available, and document the data flows in your DPIA. Using the API with proper configuration is GDPR-compatible. Using the consumer chat interface for business purposes is not.
What personal data does a chatbot collect?
More than you think. Direct data: anything the user types (names, email addresses, order numbers, complaints, health information if they mention it). Indirect data: IP addresses, device information, session IDs, timestamps, conversation history, browser fingerprint. Derived data: sentiment analysis, topic classification, user preferences inferred from conversations. All of this is personal data under GDPR.
Do I need a DPIA for my AI chatbot?
Almost certainly yes. The ICO says DPIAs are required for processing using new technologies (AI qualifies) and systematic monitoring of individuals (chatbot conversations qualify). A DPIA documents what data you process, why, the risks to individuals, and what safeguards you have in place. It's both a legal requirement and a genuinely useful exercise that forces you to think about data protection before you launch.
How do I handle conversation data retention?
Keep conversations only as long as needed. For customer support chatbots, 30-90 days covers most follow-up needs. For sales chatbots, retain until the enquiry is resolved plus a reasonable period. Never retain indefinitely. Configure your LLM provider for zero-retention (conversations not stored or used for training). Store conversation logs in your own database with automatic deletion after your retention period. Tell users in your privacy notice how long you keep their conversations.