Why We Don't Store Prompts (And Why Your Observability Tool Shouldn't Either)
Here's a question most developers don't ask when they add an LLM monitoring tool: Where do my prompts go?
The answer, for nearly every major tool in this space: a third-party server, where prompts are stored indefinitely alongside the model's full responses.
That means your users' messages, your proprietary system prompts, your few-shot examples, and the model's outputs — all sitting in someone else's database. If you're building anything that touches customer data, this should worry you.
What Most Observability Tools Store
Let's be specific. Here's what a typical LLM monitoring request log contains:
```json
{
  "timestamp": "2026-03-22T14:30:00Z",
  "model": "gpt-4o",
  "prompt_tokens": 1420,
  "completion_tokens": 380,
  "messages": [
    {
      "role": "system",
      "content": "You are a financial advisor for Acme Corp. Use the following customer data: Name: Jane Smith, Account #12345, Balance: $47,293..."
    },
    {
      "role": "user",
      "content": "Should I refinance my mortgage given current rates?"
    },
    {
      "role": "assistant",
      "content": "Based on your current balance of $47,293 and the available rates..."
    }
  ]
}
```
In this one log entry:
- Customer PII: Name, account number, financial balance
- Proprietary business logic: The system prompt with Acme Corp's advisory rules
- User intent data: What the customer is considering doing with their money
- Model output: Financial advice generated for a specific individual
All of this is now stored on a third-party server. Forever. Under their data retention policy, not yours.
The Three Risks
1. GDPR and Data Protection Liability
If you process EU user data, sending prompts containing personal information to a third-party monitoring tool creates a new data processing relationship. Under GDPR:
- You need a Data Processing Agreement (DPA) with the monitoring tool provider
- You must disclose this in your privacy policy ("We share your messages with XYZ for monitoring purposes")
- Users have a right to deletion — can you delete their prompts from the monitoring tool?
- Data transfers outside the EU require additional safeguards (SCCs, adequacy decisions)
Most developers add monitoring tools without updating their privacy policy or signing a DPA. That's a compliance gap.
2. Prompt Injection and Data Exfiltration
If an attacker can access your monitoring tool's dashboard or API, they get:
- Every system prompt (your proprietary logic)
- Every user message (potential PII, business data)
- Every model response (including any data the model leaked from its context)
Your monitoring tool becomes a single point of data exfiltration for everything your AI system processes.
3. Intellectual Property Exposure
System prompts contain your competitive advantage: the instructions, formatting rules, few-shot examples, and business logic that make your AI features work. Storing them in a third-party tool means:
- The tool's employees can potentially access them
- A breach of the monitoring tool exposes your IP
- The monitoring tool's AI features might train on your prompts (check their ToS)
What You Actually Need for Cost Monitoring
Here's the key insight: you don't need prompt content to track costs.
To answer "how much am I spending and where?" you need exactly this:
```json
{
  "timestamp": "2026-03-22T14:30:00Z",
  "model": "gpt-4o",
  "tokens_in": 1420,
  "tokens_out": 380,
  "tags": {
    "feature": "financial-advisor",
    "customer_tier": "premium",
    "environment": "production"
  }
}
```
That's it. No prompts, no completions, no PII. Just metadata and token counts.
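Per-request cost from token counts alone is simple arithmetic. A minimal sketch — the per-million-token rates below are illustrative placeholders, so verify them against your provider's current pricing page:

```python
# Illustrative per-million-token rates (USD); check current provider pricing.
PRICING = {"gpt-4o": {"input": 2.50, "output": 10.00}}

def request_cost(model: str, tokens_in: int, tokens_out: int) -> float:
    """Cost of one request, computed from token counts only -- no content needed."""
    rates = PRICING[model]
    return (tokens_in * rates["input"] + tokens_out * rates["output"]) / 1_000_000

# The log entry above (1420 input + 380 output tokens) comes to roughly $0.0074.
cost = request_cost("gpt-4o", 1420, 380)
```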
From this, you can answer:
| Question | How |
|---|---|
| Which feature costs the most? | Group by feature tag |
| Are we using the right model? | Compare cost per model per feature |
| Is staging inflating the bill? | Filter by environment tag |
| Which customer tier is most expensive? | Group by customer_tier tag |
| Are costs trending up or down? | Time series on total spend |
| Where can we save money? | Waste detection on model/feature combinations |
You don't need to see "Should I refinance my mortgage?" to know that the financial-advisor feature costs $310/month and could save $160/month by switching to GPT-4o-mini.
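Every row in that table reduces to grouping metadata. A sketch of the "group by feature tag" query over tags-only events — `spend_by_tag` is a hypothetical helper, and the event shape follows the example above:

```python
from collections import defaultdict

def spend_by_tag(events: list, tag: str) -> dict:
    """Total cost per value of one tag, e.g. spend per feature."""
    totals = defaultdict(float)
    for event in events:
        totals[event["tags"].get(tag, "untagged")] += event["cost"]
    return dict(totals)

events = [
    {"cost": 0.0074, "tags": {"feature": "financial-advisor"}},
    {"cost": 0.0031, "tags": {"feature": "chatbot"}},
    {"cost": 0.0120, "tags": {"feature": "financial-advisor"}},
]
totals = spend_by_tag(events, "feature")  # spend per feature, no content involved
```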
The Tags-Only Approach
Tags-only monitoring means:
- Send: model name, token counts, cost, and custom tags
- Don't send: messages, system prompts, completions, tool calls, or any content
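In practice this means the monitoring event is built from the response's usage metadata and nothing else. A sketch, assuming an OpenAI-style response shape (`tags_only_event` is a hypothetical helper, not a specific SDK function):

```python
def tags_only_event(response: dict, tags: dict) -> dict:
    """Build a monitoring event from an OpenAI-style chat completion response.
    Reads only the usage metadata; content fields are never touched."""
    return {
        "model": response["model"],
        "tokens_in": response["usage"]["prompt_tokens"],
        "tokens_out": response["usage"]["completion_tokens"],
        "tags": tags,
    }

response = {  # shape follows an OpenAI chat completion; content omitted on purpose
    "model": "gpt-4o",
    "usage": {"prompt_tokens": 1420, "completion_tokens": 380},
}
event = tags_only_event(response, {"feature": "chatbot", "environment": "production"})
```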
What Gets Stored
```
model:      gpt-4o
tokens_in:  1420
tokens_out: 380
cost:       $0.0074
tags:       { feature: "chatbot", user_tier: "free", environment: "production" }
```
What Does NOT Get Stored
```
messages:       [...]  ← NEVER sent
system_prompt:  "..."  ← NEVER sent
completion:     "..."  ← NEVER sent
function_calls: [...]  ← NEVER sent
```
The Privacy Guarantee
With tags-only monitoring:
- GDPR: No personal data leaves your infrastructure. No DPA needed with the monitoring tool. No privacy policy update required.
- Security: A breach of the monitoring tool reveals usage patterns, not data. No PII, no prompts, no business logic exposed.
- IP protection: Your system prompts stay in your codebase. They're never transmitted to or stored by a third party.
"But I Need Prompt Debugging"
Fair point. When something goes wrong, you want to see what the model received and what it returned.
The answer: log prompts locally, not in a third-party tool.
| Capability | Where to Do It |
|---|---|
| Cost tracking and attribution | Third-party monitoring (tags only) |
| Waste detection and recommendations | Third-party monitoring (tags only) |
| Budget alerts | Third-party monitoring (tags only) |
| Prompt debugging | Local logging (your own infrastructure) |
| Response quality evaluation | Local logging or evaluation tool |
| Compliance audit trail | Your own database |
Use your existing logging infrastructure (CloudWatch, Datadog, ELK) for prompt-level debugging. Send only tags and token counts to your cost monitoring tool. This gives you full debugging capability without sending data to another third party.
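The split can be a few lines of glue. A sketch, assuming Python's standard `logging` module for the local side (`record` is a hypothetical wrapper; swap the `FileHandler` for your CloudWatch/Datadog/ELK shipper):

```python
import json
import logging

# Prompt-level debug log stays on your own infrastructure.
prompt_log = logging.getLogger("llm.prompts")
prompt_log.setLevel(logging.INFO)
prompt_log.addHandler(logging.FileHandler("prompts.jsonl"))

def record(request_id, messages, completion, model, usage, tags):
    """Log full content locally; return a tags-only event for the cost monitor."""
    prompt_log.info(json.dumps({
        "id": request_id, "messages": messages, "completion": completion,
    }))
    return {  # the only payload that leaves your infrastructure
        "id": request_id,
        "model": model,
        "tokens_in": usage["prompt_tokens"],
        "tokens_out": usage["completion_tokens"],
        "tags": tags,
    }
```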
How the Industry Handles Prompt Storage
| Tool | Stores Prompts by Default? | Opt-Out Available? |
|---|---|---|
| Helicone | Yes | Partial (can disable logging per request) |
| Langfuse | Yes | Yes (self-host for full control) |
| Braintrust | Yes | Enterprise can configure retention |
| Portkey | Yes | Enterprise can configure |
| Respan (fka Keywords AI) | Yes | Unknown |
| AISpendGuard | No — never | N/A — no prompt data is ever accepted |
We don't offer a "disable prompt storage" toggle because there's nothing to disable. The SDK doesn't accept prompt content. The API rejects requests containing message content. It's architecturally impossible for us to store your prompts.
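The rejection rule is simple enough to sketch. This is an illustrative stand-in, not our actual API code; the field names follow the examples above:

```python
# Illustrative server-side check: any payload carrying content is refused outright.
CONTENT_FIELDS = {"messages", "system_prompt", "prompt", "completion", "function_calls"}

def validate_event(payload: dict) -> dict:
    """Accept only metadata; raise on any prompt or completion content."""
    rejected = CONTENT_FIELDS & payload.keys()
    if rejected:
        raise ValueError(f"content fields are not accepted: {sorted(rejected)}")
    return payload
```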
The Decision
If you're choosing an LLM monitoring tool, ask three questions:
1. Does it store my prompts by default? If yes, you have a new data liability.
2. Can I opt out of prompt storage? If yes, will it still provide cost tracking and waste detection?
3. Does it work with tags only? If yes, you get cost visibility without data risk.
For cost monitoring and waste detection, you don't need prompt storage. Tags and token counts give you everything you need to find and fix wasted spend — without creating new privacy, security, or compliance risks.
We built AISpendGuard on this principle from day one. Tags-only ingestion, no prompt storage, no content logging — ever. GDPR-compliant by architecture, not by configuration.
EUR pricing. EU-hosted (Frankfurt). Privacy-first.
Free tier: 50,000 events/month. No credit card required.
Start tracking at aispendguard.com
This post reflects the privacy landscape for LLM monitoring tools as of March 2026. Always check the current privacy policies and data processing agreements of any tool you integrate.