AI Security and Privacy: How to Use AI Without Leaking Data
Every prompt you type into an AI chatbot is data you are sending to someone else’s server. Every document you upload, every code snippet you paste, every customer name you mention — all of it travels over the internet to a data center where it is processed and, depending on the service and your plan, potentially stored and used to train future models.
In 2026, with 88% of enterprises using AI in at least one business function, the question is not whether your organization will use AI. It is whether you will use it safely. This guide covers what actually happens to your data when you use AI services, how to protect sensitive information, and what policies and tools minimize risk without blocking productivity.
Our AI security and privacy guidance draws on published vendor documentation, regulatory frameworks, and enterprise security best practices. Specific policies change frequently — always verify current terms directly with providers.
Table of Contents
- Key Takeaways
- What Happens to Your Data When You Use AI
- Data Policies by Provider
- Consumer vs Enterprise Data Handling
- How to Opt Out of Training Data Collection
- Zero Trust Architecture for AI
- Practical Security Measures
- AI and Regulatory Compliance
- Local AI: The Maximum Privacy Option
- Building an AI Acceptable Use Policy
- What Changed in 2026
- Common Mistakes in AI Security
- FAQ
- Sources
- Related Articles
Key Takeaways
- Free tiers of most AI services use your data for model training by default. Paid plans typically offer opt-out options or no-training guarantees.
- Enterprise plans are fundamentally different from consumer plans. They include contractual Data Processing Addendums (DPAs) that legally prohibit training on your data.
- Local AI models (Llama, Mistral) send zero data to external servers but require technical setup and capable hardware.
- A clear AI acceptable use policy is essential. Without one, employees will use AI tools anyway — just without guardrails.
- The regulatory landscape is tightening. Multiple US states and the EU AI Act now require disclosures about AI data handling.
What Happens to Your Data When You Use AI
When you send a prompt to ChatGPT, Claude, Gemini, or any cloud AI service, the following typically occurs:
The Data Journey
1. Transmission. Your input travels encrypted (TLS) from your device to the provider’s servers. All major providers use encryption in transit.
2. Processing. The AI model processes your input to generate a response. This happens on the provider’s servers (or their cloud infrastructure partner).
3. Temporary storage. Your input and the model’s output are stored temporarily, at minimum for the duration of your conversation session.
4. Logging. Most providers log interactions for abuse detection, safety monitoring, and service improvement. Retention periods vary from 30 days to indefinitely.
5. Training (the critical variable). This is where providers differ significantly. Some use your interactions to train future model versions. Others do not. The distinction depends on your plan type and opt-out settings.
What Data Is at Risk
The most common types of sensitive data users inadvertently share with AI services:
- Source code — Pasting proprietary code for debugging or review
- Customer data — Names, emails, account details included in prompts
- Financial information — Revenue numbers, pricing strategies, financial projections
- Legal documents — Contracts, agreements, compliance reports
- Internal communications — Strategy memos, performance reviews, HR discussions
- Credentials — API keys, passwords, access tokens pasted into prompts
- Intellectual property — Product plans, research findings, trade secrets
Data Policies by Provider
OpenAI (ChatGPT)
Consumer plans (Free, Plus, Pro):
- By default, OpenAI uses your inputs and outputs to train and improve models
- You can opt out via Settings > Data Controls > “Improve the model for everyone” toggle
- Opting out disables conversation history (your chats will not be saved)
- Even with opt-out, data may be retained for up to 30 days for abuse monitoring
Team and Enterprise plans:
- Data is not used for model training by default
- Enterprise accounts operate under contractual DPAs
- Data is encrypted at rest (AES-256) in addition to encryption in transit
- SOC 2 Type II compliance
- Custom data retention policies available
Anthropic (Claude)
Consumer plans (Free, Pro):
- Anthropic’s usage policy states that free-tier conversations may be used for safety research and model improvement
- Pro plan users can opt out of training data contribution
- Anthropic has historically been more conservative with data usage than competitors
Team and Enterprise plans:
- Contractual guarantee that data is not used for training
- Data Processing Addendums available
- SOC 2 Type II compliance
- Custom data retention
Google (Gemini)
Consumer plans:
- Google AI Pro conversations may be reviewed by human raters for quality improvement
- Users can manage data through Google’s My Activity controls
- Workspace data (when using Gemini in Docs, Gmail, etc.) is subject to the Google Workspace DPA for paid accounts
Enterprise (Google Cloud / Vertex AI):
- Customer data is not used for model training
- Covered by Google Cloud’s enterprise DPA
- Data residency options available (specific regions)
- FedRAMP and other compliance certifications
GitHub Copilot
Individual plans:
- On March 25, 2026, GitHub announced that, starting April 24, 2026, interaction data from Copilot Free, Pro, and Pro+ users will be used to train AI models unless those users explicitly opt out
- Opt out via Settings > Copilot > Features > “Allow GitHub to use my data for AI model training”
Business and Enterprise plans:
- Code snippets are not retained after generating suggestions
- No training on your code by default
- Telemetry data is minimized
- Enterprise adds SAML SSO and additional security controls
Consumer vs Enterprise Data Handling
The difference between consumer and enterprise AI plans is not just a matter of features or usage limits. It is a legal and architectural difference.
Consumer Plans
| Aspect | Typical Consumer Plan |
|---|---|
| Data used for training | Yes, by default (opt-out available) |
| Contractual protections | Terms of Service only |
| Data retention | Varies, often indefinite |
| Encryption at rest | Varies by provider |
| Compliance certifications | Limited |
| Admin controls | Individual settings only |
| Data residency | No geographic guarantees |
Enterprise Plans
| Aspect | Typical Enterprise Plan |
|---|---|
| Data used for training | No (contractually prohibited) |
| Contractual protections | Custom DPA with legal liability |
| Data retention | Configurable, with deletion guarantees |
| Encryption at rest | AES-256 standard |
| Compliance certifications | SOC 2, ISO 27001, HIPAA (varies) |
| Admin controls | Organization-wide policies |
| Data residency | Region-specific options |
The core distinction: consumer plans make promises in their terms of service. Enterprise plans make legally binding commitments in negotiated contracts with financial penalties for violations.
How to Opt Out of Training Data Collection
ChatGPT (OpenAI)
- Open ChatGPT Settings
- Navigate to Data Controls
- Disable “Improve the model for everyone”
- Note: This also disables conversation history
Alternatively, submit an opt-out request via OpenAI’s privacy portal or use the API (API usage is not used for training by default).
Claude (Anthropic)
- Review Anthropic’s current data usage policy at https://www.anthropic.com/privacy
- Pro subscribers can manage data preferences through account settings
- API usage follows separate terms — data sent via the API is not used for training
Gemini (Google)
- Visit myactivity.google.com
- Navigate to Gemini Apps activity
- Turn off “Gemini Apps Activity” to prevent future conversations from being saved and reviewed
- Delete existing stored activity if desired
GitHub Copilot
- Go to github.com/settings/copilot/features
- Under the Privacy heading, disable “Allow GitHub to use my data for AI model training”
- This must be done before April 24, 2026, when the new default takes effect
General Best Practices
- Review privacy settings immediately after creating an account on any AI service
- Re-check settings after major service updates (providers sometimes reset or add new defaults)
- Use API access when available — most providers exclude API usage from training data by default
- Document your opt-out choices for compliance purposes
Zero Trust Architecture for AI
Microsoft announced its Zero Trust for AI reference architecture in March 2026, and the principles apply broadly to any organization using AI services.
Zero Trust Principles Applied to AI
Never trust, always verify. Do not assume that an AI service is safe because it is from a major provider. Verify data handling policies, confirm encryption standards, and audit actual behavior.
Least privilege access. Not every employee needs access to every AI tool. A marketing team member does not need a coding AI agent with repository access. Match AI tool access to job function.
Assume breach. Design your AI usage policies as if a provider’s data will eventually be compromised. Never share credentials, API keys, or other authentication tokens with AI services.
Practical Zero Trust for AI Usage
1. Network segmentation. Route AI API traffic through monitored network paths. Use API gateways to log all requests and responses.
2. Data classification. Label data by sensitivity level. Define which sensitivity levels can be shared with which AI services. Block the highest sensitivity tiers from all external AI tools.
3. Identity and access management. Use SSO and role-based access controls for AI platform accounts. Audit who is using which tools and how frequently.
4. Continuous monitoring. Log AI interactions for security review. Automated tools can scan prompts for sensitive data patterns (credit card numbers, social security numbers, API keys) before they are sent to external services.
5. Incident response planning. Have a plan for what happens if sensitive data is inadvertently shared with an AI service. Know the provider’s data deletion procedures and response timelines.
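The continuous-monitoring step above can be sketched as a simple pre-send filter. This is an illustrative sketch only — the regex patterns and the `scan_prompt` helper are assumptions for demonstration, not any vendor’s DLP API, and real tools use far more robust detection:

```python
import re

# Illustrative patterns only -- production DLP tools use much stronger detection.
SENSITIVE_PATTERNS = {
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "api_key": re.compile(r"\b(?:sk|pk|ghp|AKIA)[A-Za-z0-9_-]{16,}\b"),
}

def scan_prompt(prompt: str) -> list[str]:
    """Return the names of sensitive-data patterns found in a prompt."""
    return [name for name, pattern in SENSITIVE_PATTERNS.items()
            if pattern.search(prompt)]

findings = scan_prompt("Debug this: key=sk_live_abcdef1234567890XYZ, SSN 123-45-6789")
# A gateway would block or redact the prompt before it leaves the network.
```

A filter like this sits naturally in an API gateway, where every outbound prompt passes through it before transmission.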
Practical Security Measures
For Individual Users
- Never paste credentials. API keys, passwords, tokens, and SSH keys should never appear in AI prompts. Use environment variable names or placeholder values instead.
- Anonymize before sharing. Replace real customer names, email addresses, and account numbers with generic placeholders before pasting into AI tools.
- Use paid plans. The $20/month investment for Claude Pro or ChatGPT Plus is worth it for the improved data handling and opt-out options alone.
- Check before uploading. Before uploading a document to any AI service, confirm it does not contain sensitive data you would not want stored on external servers.
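The anonymization step can be as simple as a substitution pass before the text reaches an AI tool. A minimal sketch, assuming regex-based redaction (real pipelines often use NER-based tools instead); the `anonymize` helper and its patterns are illustrative:

```python
import re

def anonymize(text: str) -> str:
    """Replace emails and obvious account identifiers with generic placeholders."""
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[EMAIL]", text)
    text = re.sub(r"\b[A-Z]{2}\d{8,}\b", "[ACCOUNT_ID]", text)
    return text

safe = anonymize("Customer jane.doe@example.com, account ID GB12345678, reported a bug.")
# → "Customer [EMAIL], account ID [ACCOUNT_ID], reported a bug."
```

The AI can still summarize or analyze the redacted text; the placeholders preserve structure without exposing the underlying identities.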
For Teams and Organizations
- Deploy a DLP (Data Loss Prevention) gateway. Tools like Nightfall AI, Microsoft Purview, and custom solutions can scan outbound AI prompts for sensitive data and block or redact before transmission.
- Create approved tool lists. Designate which AI tools are approved for work use and which are prohibited. Update the list as privacy policies change.
- Provide sanctioned alternatives. If you block employees from using consumer ChatGPT, give them an approved alternative (ChatGPT Enterprise, Claude Team, or a self-hosted solution). Banning AI without providing alternatives drives usage underground.
- Implement prompt templates. Pre-built templates with placeholders for sensitive fields help employees use AI effectively without inadvertently sharing confidential data.
- Regular training. Quarterly reminders about AI data hygiene. Include examples of what not to share and why.
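The prompt-template idea above can be sketched with Python’s standard `string.Template`. The template text and field names here are hypothetical examples, not a recommended wording:

```python
from string import Template

# Illustrative template -- the fields are placeholders, never raw customer data.
SUPPORT_TEMPLATE = Template(
    "Summarize this support ticket for triage.\n"
    "Customer: $customer_ref\n"
    "Product: $product\n"
    "Issue: $issue"
)

prompt = SUPPORT_TEMPLATE.substitute(
    customer_ref="CUST-0042",   # internal reference, not the customer's name
    product="Billing portal",
    issue="Invoice totals do not match the exported CSV.",
)
```

Because employees fill in labeled fields rather than free-writing prompts, the template itself documents what kind of data belongs in each slot.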
For Developers
- Use API keys with minimum scope. Do not share admin-level API keys with AI coding assistants. Create read-only or limited-scope keys where possible.
- Review AI-generated code for secrets. AI coding tools sometimes hallucinate plausible-looking API keys or credentials. Scan generated code before committing.
- Exclude sensitive files from AI context. Use .gitignore-style exclusion rules in AI coding tools to prevent .env files, credential stores, and private keys from being sent to AI services.
- Consider local models for sensitive codebases. Llama 3.3 running locally via Ollama handles many coding tasks without sending any code to external servers.
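The exclusion-rule idea can be approximated with a small path filter. A hedged sketch — the deny-list patterns and the `is_safe_to_send` helper are illustrative assumptions, not the configuration syntax of any particular AI coding tool:

```python
from fnmatch import fnmatch

# Hypothetical deny-list of files that should never reach an AI service.
EXCLUDED_PATTERNS = [".env", "*.pem", "*.key", "secrets/*", "*credentials*"]

def is_safe_to_send(path: str) -> bool:
    """Return False for any path matching a .gitignore-style exclusion rule."""
    return not any(fnmatch(path, pattern) for pattern in EXCLUDED_PATTERNS)

files = ["src/app.py", ".env", "secrets/db.yaml", "deploy.key", "README.md"]
context = [f for f in files if is_safe_to_send(f)]
# → ["src/app.py", "README.md"]
```

Most AI coding tools expose a native mechanism for this (check your tool’s documentation); the point is that the filter runs before any file content is added to the model’s context.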
AI and Regulatory Compliance
United States
In 2026, AI regulation in the US is state-driven:
- California: Requirements for AI transparency, including disclosure of AI-generated content and training data sources.
- Texas: AI governance statutes effective in early 2026 requiring algorithmic logic disclosures.
- Illinois: AI use in employment decisions requires disclosure and certain consent mechanisms.
- Colorado: AI transparency requirements for high-risk AI systems, including insurance and hiring applications.
There is no comprehensive federal AI privacy law as of March 2026, though multiple bills are in various stages of legislative progress.
European Union
The EU AI Act, in force since mid-2025, creates a risk-based framework:
- Unacceptable risk — Banned applications (social scoring, real-time biometric surveillance in public spaces with limited exceptions)
- High risk — Strict requirements for AI in healthcare, education, employment, law enforcement (conformity assessments, documentation, human oversight)
- Limited risk — Transparency obligations (chatbots must disclose they are AI)
- Minimal risk — No specific requirements
Organizations serving EU customers must comply regardless of where they are headquartered.
Industry-Specific Requirements
- Healthcare (HIPAA): AI tools processing protected health information (PHI) must be covered by Business Associate Agreements (BAAs). Standard consumer AI plans do not satisfy this requirement.
- Finance (SOX, PCI-DSS): Financial data processed by AI must meet existing audit and security standards. Custom deployments with logging and access controls are typically required.
- Education (FERPA): Student data shared with AI services must comply with FERPA requirements. Enterprise plans with appropriate DPAs are necessary.
Local AI: The Maximum Privacy Option
For organizations where no data can leave the network, local AI deployment offers complete control.
What Local AI Means
Running an AI model on your own hardware — a laptop, a workstation, or an on-premises server. The model operates entirely on your infrastructure. No data is transmitted to any external service.
Current Local Options
Llama 3.3 (Meta) — The most capable open-weight model family. The 70B parameter version runs on consumer hardware with quantization (a process that reduces the model’s precision to fit in less memory). Requires a GPU with 32GB+ VRAM for full performance.
Mistral Large — Competitive European alternative. Available for local deployment through various frameworks.
Ollama — A tool that simplifies running models locally. Install, pull a model, and start chatting in minutes. Supports Llama, Mistral, and dozens of other models.
LM Studio — Desktop application for running local models with a graphical interface. No command-line knowledge needed.
Trade-Offs
| Factor | Cloud AI (ChatGPT, Claude) | Local AI (Llama, Mistral) |
|---|---|---|
| Output quality | Frontier-level | Good but not frontier |
| Privacy | Depends on plan and settings | Complete — no data leaves device |
| Setup effort | Zero | Moderate (install tools, download models) |
| Cost | $20+/month | Free (software), hardware cost |
| Speed | Fast (server GPUs) | Depends on your hardware |
| Context window | Up to 1M tokens | Typically 8K–128K tokens |
| Multimodal | Full (text, image, code, voice) | Limited (primarily text and code) |
When to Use Local AI
- Processing data that cannot leave your network under any circumstances (classified, HIPAA, trade secrets)
- Development and testing where you need AI assistance but cannot risk code exposure
- Organizations in regulated industries where cloud AI compliance is too complex or expensive
- Privacy-conscious individuals who want zero data sharing on principle
Building an AI Acceptable Use Policy
Every organization using AI tools needs a written policy. Without one, employees will make their own decisions — and those decisions will not always align with your security requirements.
Core Policy Elements
1. Approved tools. List specific AI tools that are approved for work use, along with the plan type (e.g., “ChatGPT Enterprise — approved; ChatGPT Free — prohibited for work data”).
2. Data classification rules. Define what types of data can and cannot be shared with AI services:
- Public data: May be shared with any approved AI tool
- Internal data: May be shared with approved AI tools on enterprise plans only
- Confidential data: May only be processed by locally deployed AI or approved enterprise tools with DPAs
- Restricted data: Must not be shared with any AI service (credentials, PHI without BAA, classified information)
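The classification rules above lend themselves to machine enforcement. A minimal sketch, assuming a hypothetical `POLICY` mapping from the four data classes to permitted tool tiers (a real DLP gateway would encode the same table):

```python
# Hypothetical mapping from the policy's data classes to permitted tool tiers.
POLICY = {
    "public":       {"consumer", "enterprise", "local"},
    "internal":     {"enterprise", "local"},
    "confidential": {"enterprise", "local"},  # enterprise only with a DPA in place
    "restricted":   set(),                    # no AI service, period
}

def can_share(data_class: str, tool_tier: str) -> bool:
    """Check whether a data classification may be sent to a given tool tier."""
    return tool_tier in POLICY.get(data_class, set())

allowed = can_share("internal", "enterprise")  # permitted under this policy
blocked = can_share("restricted", "local")     # never permitted
```

Unknown classifications default to “not shareable,” which is the safe failure mode for a policy check.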
3. Required safeguards. Mandate specific behaviors:
- Anonymize customer data before sharing
- Never paste credentials or API keys
- Use enterprise plan settings, not personal accounts
- Review AI outputs before external use
4. Reporting procedures. Define what to do if sensitive data is accidentally shared with an AI service:
- Who to notify (security team, compliance officer)
- What information to document (what was shared, which service, when)
- Provider data deletion request procedures
5. Review cadence. AI tools and policies change rapidly. Commit to reviewing the policy quarterly and updating when providers change their terms.
What Changed in 2026
Opt-out became opt-in for more services. GitHub’s March 2026 announcement that Copilot will train on user data by default (unless opted out) reflects a broader industry trend. As AI companies need more training data, default settings increasingly favor data collection. User vigilance is more important than ever.
Zero Trust for AI became a formal framework. Microsoft’s March 2026 release of its Zero Trust for AI reference architecture provides enterprises with a structured approach to securing AI usage. A Zero Trust Assessment for AI pillar is expected in summer 2026.
State-level regulation accelerated. With no federal AI privacy law, US states are acting independently. Texas, California, Illinois, and Colorado all have AI-specific statutes taking effect in the first half of 2026. Compliance requires awareness of which states’ laws apply to your operations.
Enterprise AI governance matured. 68% of privacy professionals have now acquired AI governance responsibilities. AI is no longer just an IT concern — it is a compliance, legal, and risk management issue integrated across business functions.
Context windows created new risk surface. With Claude and Gemini supporting 1M-token context windows, users can now upload entire codebases, complete legal document sets, or years of email history in a single session. This dramatically increases both the utility and the data exposure risk of AI interactions.
Common Mistakes in AI Security
Assuming “don’t use AI” is a workable policy. It is not. Employees will use AI tools regardless of prohibitions. The result is unmonitored, ungoverned AI usage with zero security controls. A better approach is providing approved tools with clear guidelines.
Using personal accounts for work. An employee using their personal ChatGPT Free account for work tasks means your company’s data is being processed under consumer terms with potential training data usage. Provide and mandate team or enterprise accounts.
Trusting opt-out settings without verifying. Opt-out toggles can reset after updates, and the specific data covered by opt-out varies. Periodically verify that your settings are still active and understand exactly what they cover.
Ignoring API key security. Developers regularly paste API keys into AI prompts for debugging help. A single leaked production API key can compromise entire systems. Train developers to use placeholder values and never share real credentials.
Overlooking file uploads. Uploading a document to an AI service shares its entire contents. A PDF that contains a single paragraph of proprietary data alongside public information still exposes that proprietary data. Review documents before uploading.
Failing to update policies when providers change terms. AI service terms change frequently. GitHub’s March 2026 policy change is one example. Assign someone to monitor provider policy updates and adjust your internal policies accordingly.
FAQ
Does ChatGPT use my data for training?
On consumer plans (Free, Plus, Pro), yes, by default. You can opt out via Settings > Data Controls > “Improve the model for everyone,” but this also disables conversation history. On Team and Enterprise plans, your data is not used for training, backed by contractual DPAs. API usage is not used for training by default, regardless of plan.
Is Claude more private than ChatGPT?
Anthropic has historically been more conservative with data usage. Claude’s policies are generally more restrictive about training data collection. However, the safest approach with any cloud AI service is to use enterprise-tier plans with contractual protections rather than relying on published privacy policies alone, as policies can change.
Can AI tools leak my source code?
There is no confirmed case of a major provider directly exposing one user’s code to another user. However, if your code is used as training data, patterns and structures from it could influence future model outputs. The practical risk is low but not zero. Use enterprise plans or local models for highly sensitive code.
What should our company’s AI policy include?
At minimum: a list of approved AI tools and plan types, data classification rules defining what can be shared with each tool tier, mandatory anonymization requirements for customer data, a prohibition on sharing credentials, a reporting procedure for accidental data exposure, and a quarterly review schedule.
Is running AI locally really more secure?
For data privacy, yes — unequivocally. Data processed by a local model never leaves your machine. However, local models introduce their own security considerations: you need to keep models and tools updated, secure the hardware itself, and ensure the model weights were downloaded from legitimate sources.
How do I know if my AI usage complies with GDPR?
If you process EU personal data with AI, you need: a lawful basis for processing (typically legitimate interest or consent), a DPA with your AI provider, a record of processing activities that includes AI usage, the ability to respond to data subject access requests (including AI interactions), and compliance with the EU AI Act’s transparency requirements. Consult legal counsel for your specific situation.
What is the biggest AI security risk for businesses in 2026?
Unmonitored employee use of consumer AI tools with sensitive company data. This is not a hypothetical — surveys indicate the majority of knowledge workers use AI tools that their IT departments have not approved. Providing sanctioned alternatives with clear policies is far more effective than attempting to block AI usage entirely.
Sources
- Microsoft Security Blog — Zero Trust for AI (March 2026): https://www.microsoft.com/en-us/security/blog/2026/03/19/new-tools-and-guidance-announcing-zero-trust-for-ai/
- Anthropic Privacy Policy: https://www.anthropic.com/privacy
- OpenAI Enterprise Privacy: https://openai.com/enterprise-privacy
- Google Cloud DPA: https://cloud.google.com/terms/data-processing-addendum
- EU AI Act: https://artificialintelligenceact.eu
- Hyperproof Data Protection Strategies 2026: https://hyperproof.io/resource/data-protection-strategies-for-2026/
- GitHub Copilot Privacy Policy Update (March 2026): https://github.blog
Related Articles
- AI Tools Privacy Security Guide
- AI Safety Debate
- Open Source vs Closed AI
- Best Local AI Models
- Best AI for Local LLM On Device 2026
- Run Llama Locally
- Best AI for Cybersecurity
- Best AI for Compliance
- Best AI for Fraud Detection
- Best AI for Threat Detection
- Best AI for Penetration Testing
- Best AI for Contract Review
- Best AI for Legal Research
- AI Costs Explained
- AI API Pricing Comparison
- AI for Business
- AI Tools for Small Business
- Best AI for Document Management
- Best AI for Video Surveillance
- Best AI for Background Checks
- Llama vs Mistral
- Complete Guide to AI Models