AI Agent Safety Layer

AI Agents Run Dangerous Commands.
Caro Catches Them.

Your LLMs will hallucinate. Your flags will fail. Caro relies on neither: pattern-based validation that catches destructive commands, whether they come from a user or a confused AI.

Claude Code deleted a user's entire home directory despite --dangerously-skip-permissions
Gemini CLI hallucinated file paths, then confidently deleted real files while 'cleaning up'
🎲 Hallucination Resistant 🔒 Pattern-Based Deterministic 📖 Open Source

Why Flags Aren't Enough

Real incidents from 2025 where safety measures failed

Claude Code --dangerously-skip-permissions
What happened:

AI ran rm -rf ~/ anyway, deleting the user's entire home directory

Why it failed:

The flag controls permission prompts, not command validation. The AI didn't need permission—it just executed.

— HN Discussion, Dec 2025
Gemini CLI Built-in safety checks
What happened:

AI hallucinated file paths that didn't exist, then confidently deleted real files while 'cleaning up'

Why it failed:

LLMs are stochastic. They make up data with high confidence. Safety checks can't catch hallucinations.

— HN Discussion, Jul 2025

Flag-Based Safety vs Pattern-Based Safety

Flag-Based (Others)
  • Trust the AI to remember the flag
  • Hope the flag covers all edge cases
  • Can't catch hallucinated commands
  • Probabilistic protection
  • Fails silently when AI gets creative
Pattern-Based (Caro)
  • Validates every command, regardless of source
  • 52+ dangerous patterns compiled in
  • Catches hallucinations: a dangerous pattern is dangerous whether the path is real or imagined
  • Deterministic protection
  • Explicit warnings before execution
💡
The key difference: Flags ask "did the user consent?" Caro asks "is this command dangerous?" The AI doesn't need consent to run rm -rf ~/. It needs to be stopped.
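
To make the distinction concrete, here is a minimal shell sketch of the idea (illustrative only, not Caro's actual code or pattern set). Because validation inspects the command string itself, it fires whether a human typed the command or an LLM hallucinated it:

cmd='rm -rf ~/'   # could come from a user or a model; the source doesn't matter
case "$cmd" in
  *'rm -rf /'*|*'rm -rf ~'*|*'mkfs.'*) echo "WARNING: dangerous pattern: $cmd" >&2 ;;
esac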

Calculate Your AI Risk

The math of probabilistic failures at enterprise scale

Example: 500 developers, 50 AI commands per developer per day, 99.0% AI reliability

500 developers × 50 commands = 25,000 daily AI commands
25,000 commands × 1.0% failure rate = 250 potentially dangerous commands/day
Without Caro
250 potentially dangerous commands daily that could slip through

It only takes one rm -rf ~/ slipping through to wipe out a developer's entire environment

With Caro
0 dangerous patterns executed without warning

Deterministic validation—no dice rolls, no "demo gods"

Hundreds of developers × probabilistic AI = guaranteed failures. Caro is your insurance policy.

Get Started Free

This Happened. It Will Happen Again.

Documented incidents where AI tools caused real damage

Claude Code December 2025

Deleted user's entire home directory

Despite using --dangerously-skip-permissions flag, Claude Code executed rm -rf ~/ when asked to 'clean up temp files'. The flag only controlled permission prompts—it didn't validate the actual commands.

Impact: Complete loss of home directory including code, documents, and configuration files
Key lessons:
  • Permission flags don't validate commands
  • AI can misinterpret 'cleanup' requests destructively
  • Backups soften the blow but can't prevent it: everything since the last snapshot is gone
Gemini CLI July 2025

Hallucinated file paths and deleted real files

Gemini CLI confidently fabricated file paths that didn't exist, then 'cleaned up' by deleting actual files in similar locations. The AI showed no uncertainty about its hallucinated information.

Impact: Deletion of project files and configuration that the user never intended to modify
Key lessons:
  • LLMs hallucinate with high confidence
  • Safety checks can't catch made-up paths
  • Stochastic systems will eventually fail
📅
Two major incidents in 2025 alone. As AI coding tools become more popular, these incidents will increase. The question isn't if it will happen to your team—it's when.

What Enterprise Teams Are Saying

Companies that deploy AI agents at scale trust Caro

"We deployed Claude agents to 200 developers. After reading about the home directory incident, Caro became mandatory. It's not optional anymore—it's our insurance policy."
Platform Engineering Lead Series B SaaS Company
After deploying AI coding assistants at scale
"The math is simple: 200 devs × 100 daily AI commands = 20,000 commands. Even 99.9% reliability means 20 potentially dangerous commands per day. Caro catches them all."
VP of Engineering Enterprise Fintech
On enterprise AI risk management
"We can't tell our engineers to 'just be careful' with AI tools. That's not a strategy. Pattern-based validation is deterministic. That's a strategy."
Senior SRE Fortune 500 Tech
On replacing behavioral policies with technical controls
52+ Dangerous patterns blocked
0 Cloud dependencies
100% Local execution
<100ms Validation overhead

The incidents above aren't edge cases. LLMs are probabilistic systems; failures are inevitable at scale.

AI Agent Deployment Best Practices

Defense in depth for AI-powered shell tools

👤

Run as Unprivileged User

Never run AI tools with sudo or as root. Create a dedicated user with minimal permissions.

sudo useradd --no-create-home --shell /bin/false caro-agent
Caro: Blocks sudo/su commands and warns when running as root
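
To invoke the agent under that account (a sketch; assumes the caro binary is installed system-wide, e.g. in /usr/local/bin, since the account has no home directory):

# sudo -u runs the command directly, so the /bin/false login shell never runs
sudo -u caro-agent caro "find files modified in the last 7 days"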
📁

Sandbox to Specific Directories

Confine AI agents to specific directories. Protect /home, /etc, and system paths.

# Caro warns on operations outside working directory
Caro: Detects and warns on /home, ~, /etc, /usr operations
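
For an OS-level guarantee on top of Caro's warnings, a sandboxing tool such as bubblewrap can mount everything outside the project read-only (a sketch, assuming bwrap is installed; this is separate from Caro itself):

# Root filesystem read-only; only the current project directory stays writable
bwrap --ro-bind / / --bind "$PWD" "$PWD" --dev /dev --proc /proc \
  --die-with-parent caro "delete old build artifacts"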
🐳

Container Isolation

Run AI tools in containers with no access to important data or host filesystem.

docker run --rm -v $(pwd):/workspace caro-sandbox
Caro: Works in containers with zero host access needed
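
A more locked-down variant of the same command (a sketch; assumes the caro-sandbox image's entrypoint is caro and that inference runs fully offline, as Caro's does):

# No network, read-only root filesystem, only the project mounted writable
docker run --rm --network none --read-only \
  -v "$(pwd)":/workspace -w /workspace caro-sandbox "remove stale log files"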
🎲

Assume Hallucinations Will Happen

LLMs are probabilistic. Even with 99% accuracy, 1 in 100 commands could be dangerous.

# AI may fabricate paths: rm -rf /imaginary/but/destructive
Caro: Pattern matching catches dangerous commands regardless of source

Defense in Depth: Don't Rely on Flags Alone

1 Unprivileged User (no sudo)
2 Directory Sandboxing
3 Container Isolation
4 Caro Pre-Execution Validation

Each layer catches what the others miss. Caro is your last line of defense—not your only one.
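
Put together, the layers compose into a single invocation (a sketch; assumes the caro-agent user from step 1 is allowed to run docker and that the caro-sandbox image wraps caro):

# Layers 1-4 at once: unprivileged user, sandboxed mount, container, Caro inside
sudo -u caro-agent docker run --rm --network none \
  -v "$HOME/project":/workspace -w /workspace caro-sandbox "tidy up build artifacts"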

The Math of AI Risk at Enterprise Scale

1,000 developers
×
100 AI commands/day
=
100,000 AI-generated commands/day

Even at 99.9% AI accuracy, that's 100 potentially dangerous commands daily. One bad hallucination without Caro = catastrophe.
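
And "potentially" understates it; at this scale a bad day is a statistical certainty:

# Chance of a fully clean day at 99.9% per-command accuracy:
#   P(zero failures) = 0.999^100,000 ≈ e^(-100) ≈ 10^-44
# Some dangerous command, somewhere in the fleet, is effectively guaranteed daily.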

Common Concerns

Real questions from skeptical engineers (we get it)

🔄 Wait, Caro runs locally. How does it stay updated with new dangerous patterns?

Caro's safety patterns are baked into the binary—no network needed. When you update Caro (cargo install caro --force), you get the latest patterns. The core dangerous commands (rm -rf /, fork bombs, disk wipers) don't change. We also accept pattern contributions via GitHub.

Will this slow down my incident response?

No. Caro adds <100ms to command generation. The safety check is instant (pattern matching, not AI inference). In a real incident, that's 100ms that might save you from making things 10x worse. The validation is synchronous—you see the warning immediately.

🖥️ How does it know my specific system setup (BSD vs GNU, etc.)?

Caro detects your OS and shell at runtime. On macOS, it knows you're using BSD tools. On Linux, it adjusts for GNU syntax. It reads your $SHELL and adjusts accordingly. No configuration needed—it just works.
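
The detection comes down to signals you can inspect yourself:

uname -s       # Darwin means BSD userland; Linux means GNU userland
echo "$SHELL"  # e.g. /bin/zsh or /bin/bash; generated syntax matches it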

🔓 What if I actually NEED to run a dangerous command?

Caro warns, it doesn't jail. When you see a warning, you can still proceed—we just make sure you're doing it intentionally. For truly destructive commands (rm -rf /), you'll need to confirm. This is your seatbelt, not a straitjacket.

🔒 Is this just another AI wrapper that sends my commands to the cloud?

No. Caro runs 100% locally. Your commands, file paths, server names, and directory structures never leave your machine. The inference happens on your hardware. We collect minimal, anonymous usage metrics to improve the product—see our telemetry page for details. Check the source code—it's AGPL-3.0 licensed.

🎯 Why should I trust AI-generated shell commands at all?

You shouldn't trust them blindly—that's the point. Caro generates commands AND validates them before you run them. It's not 'trust the AI'—it's 'trust the pattern-based safety layer that catches what the AI might get wrong.' The validation is deterministic, not probabilistic.

Still skeptical? Good—you should be.

Read the source code →

Try Caro in 30 Seconds

No account. No API key. No cloud. Just safer shell commands.

bash <(curl --proto '=https' --tlsv1.2 -sSfL https://setup.caro.sh)

Then run:

caro "find files modified in the last 7 days"
Installs to ~/.cargo/bin
Single binary, no dependencies
Uninstall anytime: cargo uninstall caro

Prefer to build from source? See all installation options →