Why I Isolate AI Agents: The March 2026 LiteLLM Attack

Why I run my AI agents in isolated infra, and why March 24th proved me right

I get a lot of raised eyebrows when I tell people I run all my AI agent experiments in isolated infrastructure. Separate VMs. No cloud credentials mounted. Network egress restricted. "Isn't that overkill for hobby stuff?" Maybe. But then March 24, 2026 happened, and I spent the day watching my timeline fill with developers discovering their SSH keys, AWS credentials, Kubernetes configs, and crypto wallet seeds had just been quietly exfiltrated, courtesy of a Python package they trusted.

The package was litellm. The attack was elegant, the cleanup was brutal, and the lessons are directly relevant to anyone who uses AI tooling on their dev machine. Let me walk you through it.

What Even Is LiteLLM?

Quick context if you're not deep in the AI stack: LiteLLM is a Python library that gives you one unified API to talk to 100+ LLM providers (OpenAI, Anthropic, Gemini, Bedrock, you name it). Instead of writing provider-specific code for each model, you call LiteLLM and it handles the translation layer.

It's extremely popular. About 3.4 million downloads per day popular.

It gets used in two ways: as a Python SDK in your code, and as a standalone proxy server that your entire org routes model calls through. That second use case is important. When you run LiteLLM as a proxy, that one machine holds API keys for every AI provider your team uses. From an attacker's perspective, that's a very interesting machine to compromise.

How the Attack Got In (This Part Is Wild)

The attackers didn't phish a developer. They didn't brute-force anything. They played the long game through the toolchain.

Five days before the attack, they compromised trivy-action, the GitHub Action for Trivy, an open-source container security scanner made by Aqua Security. They rewrote the Git tags in the repo to point to a malicious release. LiteLLM used Trivy in its CI/CD pipeline, pulling it from apt without a pinned version.

Think about that for a second. A security scanner was the entry point.

When LiteLLM's CI next ran, it pulled the poisoned Trivy action, which silently exfiltrated the PYPI_PUBLISH token from the GitHub Actions runner environment. The attackers now had direct publish rights to the litellm package on PyPI.

The day before the attack, they registered models.litellm.cloud, a domain crafted to look official, which would serve as the exfiltration endpoint.

On March 24, they published two malicious versions in 13 minutes.

Here's the detail that should give any DevOps person pause: neither version appears anywhere in the LiteLLM GitHub release history. The repo only goes up to v1.82.6.dev1. Versions 1.82.7 and 1.82.8 were uploaded directly to PyPI using the stolen token, bypassing every CI/CD workflow, every review, every safeguard the team had in place. The package registry was updated. The repository never was.
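That mismatch between registry and repository is something you can check for mechanically. Here's a minimal sketch, assuming you have already fetched the list of published versions and the list of repo tags by whatever means you prefer. The function name and the sample data are mine, for illustration only.

```python
def versions_missing_from_repo(registry_versions, repo_tags):
    """Return registry versions that have no matching tag in the source repo."""
    # Tags are often prefixed with "v" (e.g. "v1.82.6.dev1"); strip it first.
    tagged = {tag.removeprefix("v") for tag in repo_tags}
    return [v for v in registry_versions if v not in tagged]


# Illustrative data based on the version numbers mentioned above.
registry = ["1.82.6.dev1", "1.82.7", "1.82.8"]
tags = ["v1.82.6.dev1"]

orphans = versions_missing_from_repo(registry, tags)
print(orphans)
```

A release that exists on the registry but has no tag behind it isn't proof of compromise, since plenty of projects tag sloppily, but it's a cheap signal worth alerting on.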

The Two Payloads

The attackers published two versions, each with a different delivery mechanism, suggesting they were iterating in real time.

v1.82.7 embedded the malicious payload inside litellm/proxy/proxy_server.py. It fires when anything imports litellm.proxy, which is the standard import path for running LiteLLM's proxy server.

v1.82.8 went further. It added a file called litellm_init.pth to site-packages. If you're not familiar with .pth files: Python's site module processes them on every interpreter startup, and any line that begins with import is executed as code. Not on import. Not on first use. On every startup, including when you run pip, when your IDE's language server initializes, when a subprocess spawns. No import litellm required, ever.
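If you've never seen the .pth mechanism in action, here's a harmless sketch of it. It writes a throwaway .pth file into a temp directory and asks the standard library's site module to process that directory, which is the same processing site-packages gets at startup. The file name and the PTH_DEMO variable are made up for the demo.

```python
import os
import site
import tempfile

with tempfile.TemporaryDirectory() as demo_dir:
    pth_path = os.path.join(demo_dir, "demo.pth")
    with open(pth_path, "w") as f:
        # A line starting with "import" is executed instead of being
        # treated as a path to add to sys.path.
        f.write("import os; os.environ['PTH_DEMO'] = 'ran'\n")

    # Process the directory's .pth files, as happens for site-packages
    # when the interpreter starts.
    site.addsitedir(demo_dir)

# The variable was set by the .pth line, not by any import we wrote.
print(os.environ.get("PTH_DEMO"))
```

Nothing in the script imports or calls the code inside demo.pth. Dropping the file into a processed directory is enough, and that's the whole problem.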

The .pth payload looks like this:

import os, subprocess, sys
subprocess.Popen([sys.executable, "-c", "import base64; exec(base64.b64decode('...'))"])

Double base64-encoded, so it survives naive grep. And here's the kicker: the file is correctly declared in the wheel's RECORD with a valid checksum:

litellm_init.pth,sha256=ceNa7wMJnNHy1kRnNCcwJaFjWX3pORLfMh7xGL8TUjg,34628

pip install --require-hashes would pass for anyone whose hashes were generated from these releases. You're verifying you received exactly what the attacker published, and you did. The integrity guarantees of the package ecosystem assume the publishing credentials are trustworthy. Once those are stolen, that assumption is gone.
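To make that concrete, here's a small sketch of what a RECORD-style check actually proves. Wheel RECORD entries store a SHA-256 digest encoded as URL-safe base64 with the padding stripped. The helper names and the sample file contents below are mine, not anything from pip or from the malicious wheel.

```python
import base64
import hashlib


def record_digest(data: bytes) -> str:
    """Hash bytes the way a RECORD entry does: sha256, urlsafe base64, no padding."""
    digest = hashlib.sha256(data).digest()
    return "sha256=" + base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def matches_record(data: bytes, record_line: str) -> bool:
    """Check file contents against one 'path,hash,size' RECORD line."""
    _path, expected_hash, size = record_line.rsplit(",", 2)
    return expected_hash == record_digest(data) and int(size) == len(data)


# Hypothetical stand-in for a file shipped inside a wheel.
shipped = b"import os  # whatever the publisher uploaded\n"
line = f"example.pth,{record_digest(shipped)},{len(shipped)}"

print(matches_record(shipped, line))         # contents match what was published
print(matches_record(shipped + b"x", line))  # later tampering is detected
```

The check tells you the bytes are the ones the publisher uploaded. It has no opinion on who the publisher was.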

What the Malware Actually Did

The community reverse-engineered the decoded payload within hours of the disclosure. Here's what it harvested from affected machines:

Credentials and keys: SSH private keys (id_rsa, id_ed25519, id_ecdsa), .env files, shell history across bash/zsh/sh, .netrc, .npmrc, .vault-token, .git-credentials, .gitconfig.

Cloud credentials: AWS credentials file + config, live IMDS queries with full IMDSv2 signing to pull from Secrets Manager and SSM Parameter Store, GCP application default credentials, Azure token files.

Kubernetes: ~/.kube/config, all of /etc/kubernetes/*.conf, in-cluster service account tokens and certificates.

Containers: Docker config.json from every known path, including Kaniko.

CI/CD configs: terraform.tfvars, .gitlab-ci.yml, .travis.yml, Jenkinsfile, .drone.yml. Anything that might contain secrets that ended up in version-controlled configs.

Databases and services: .my.cnf, .pgpass, .mongorc.js, Redis CLI history.

Crypto wallets: Bitcoin, Ethereum keystore, Solana, Cardano, Monero, Ledger hardware wallet configs.

Everything was encrypted (AES-256-CBC, session key wrapped with a hardcoded 4096-bit RSA public key) and POSTed to https://models.litellm.cloud/. Temporary files (session.key, payload.enc, tpcp.tar.gz) were created in /tmp during exfiltration.

If the malware found a Kubernetes service account token, it went further. It read all secrets across every namespace in the cluster, then attempted to deploy a privileged alpine:latest pod onto every node, in the kube-system namespace, mounting the host filesystem and installing a persistent backdoor. Pods were named node-setup-{node_name}.

The backdoor itself, written to ~/.config/sysmon/sysmon.py and registered as a systemd user service, polls https://checkmarx.zone/raw every five minutes for a URL and executes whatever it finds. The attacker can push live payloads to compromised machines at will.

The Bug That Saved People (Accidentally)

Here's the most darkly ironic part of this whole story.

The .pth mechanism fires on every Python startup. The first thing the payload does is spawn a new Python subprocess. That subprocess also triggers .pth execution since litellm_init.pth is still in site-packages. Which spawns another. Which spawns another.

An unintended fork bomb: a bug in the malware itself.

This is why Callum McMahon at FutureSearch noticed anything was wrong in the first place. His 48GB Mac ground to a halt. htop took tens of seconds to open. 11,000 processes running. Without that mistake, the payload would have exfiltrated credentials silently in the background, planted its backdoor, cleaned up temp files, and disappeared. Nobody would have known until someone found a stolen key already in use.

As Andrej Karpathy put it on X: the malware's own poor quality is what made it visible.

The Disclosure: Community 1, Attackers 0

Once Callum's team identified the malicious package, they posted a detailed technical disclosure in GitHub issue #24512 at 11:48 UTC. It hit Hacker News about 45 minutes later and reached 324 points.

The attackers responded by flooding the issue with 88 bot comments from 73 previously-compromised developer accounts in a 102-second window. Then they used the stolen krrishdholakia maintainer account (the actual LiteLLM CEO's account) to close issue #24512 as "not planned."

The community opened a new tracking issue (#24518), noted what had happened, and kept the discussion alive on Hacker News. PyPI quarantined both versions at ~13:38 UTC. Total exposure window: about three hours.

By 15:09 UTC, the LiteLLM maintainers confirmed all GitHub, Docker, and PyPI credentials had been rotated and maintainer accounts moved to new identities. Google's Mandiant team was brought in for forensic analysis of the build pipeline.

Major downstream projects (DSPy, MLflow, CrewAI, OpenHands, Arize Phoenix) filed emergency PRs to pin away from the compromised versions the same day.

This Wasn't a One-Off. It Was Phase 09.

The group behind this, tracked as TeamPCP, has been running an ongoing campaign since at least December 2025. LiteLLM was Phase 09.

The same RSA public key appears in the Trivy, KICS (a Checkmarx IaC scanner), and LiteLLM payloads. Same tpcp.tar.gz naming. Same infrastructure registrar. The target selection across all three is deliberate: each is a tool that requires elevated, broad access to the systems it operates on. A container scanner, an IaC scanner, an LLM gateway: all of them sit deep inside CI/CD pipelines and developer machines, with legitimate reasons to read credentials.

TeamPCP also deployed something called CanisterWorm, which uses the Internet Computer Protocol (ICP) as a C2 channel. ICP canisters can't be taken down by domain registrars or hosting providers. They're also apparently using an AI agent for automated attack targeting. Supply chain attacks are now getting automated. Fun times.

What This Means If You Run AI Tooling (Read: Probably You)

Here's the thing that makes this incident different from a typical npm left-pad situation. The AI developer ecosystem has converged on patterns that are genuinely great for productivity and genuinely terrible for security:

uvx and npx auto-pull the latest version of everything. When Cursor loads an MCP server, it runs it via uvx, which automatically resolves and downloads dependencies. Unpinned, from the internet, on your dev machine, which has your AWS credentials, SSH keys, and Kubernetes config sitting in the same well-known default paths they have occupied for twenty years.

Transitive dependencies are invisible. Callum didn't install litellm. His MCP server had an unpinned litellm dependency. uvx pulled the latest version, which happened to have been maliciously published 13 minutes earlier. The attack surface was a dependency of a plugin of an IDE.

LLM gateways are credential aggregators by design. If you're running LiteLLM as a proxy (which is the recommended production pattern), that machine holds API keys for every model provider you use. Compromising it is a one-stop shop.

For what it's worth: this is exactly why I run AI experiments in isolated infra. Not because I'm paranoid, but because the ergonomics of the AI tooling ecosystem (auto-pulling dependencies, local execution, broad filesystem access) are a different threat model than running a web server. A compromised nginx config doesn't exfiltrate your AWS credentials. A compromised Python package that fires on every interpreter startup might.

What You Should Actually Do

If you installed litellm between 10:39 and ~13:38 UTC on March 24, 2026, assume the machine is compromised regardless of whether you ran any application code. The .pth mechanism fires during pip install itself.
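One way to find out whether an environment ever got one of the bad releases is to look at the dist-info directories on disk, without importing anything. Here's a sketch under that assumption; the function name is mine, and the version set comes from the two releases named earlier. Run it with an interpreter that lives outside the environment you're inspecting, because starting the suspect interpreter is exactly what triggers the .pth payload.

```python
from pathlib import Path

# Version numbers taken from the incident description above.
BAD_VERSIONS = {"1.82.7", "1.82.8"}


def find_bad_litellm(site_packages: Path) -> list[Path]:
    """Return litellm dist-info directories whose version is a known-bad one."""
    hits = []
    for entry in site_packages.glob("litellm-*.dist-info"):
        # Directory names look like "litellm-<version>.dist-info".
        version = entry.name[len("litellm-"):-len(".dist-info")]
        if version in BAD_VERSIONS:
            hits.append(entry)
    return sorted(hits)
```

Point it at each site-packages directory you care about, for example find_bad_litellm(Path("/path/to/venv/lib/python3.12/site-packages")), where the path is a placeholder for your own. An empty result only tells you the package isn't installed there now, not that it never was.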

Check for the persistence backdoor:

ls ~/.config/sysmon/sysmon.py
systemctl --user status sysmon.service
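If you'd rather sweep a fleet of machines or mounted disk images, the same file checks can be scripted. This is a rough sketch that only looks for the paths named in this post; the function and constant names are mine. Taking the home and temp directories as parameters lets you aim it at a mounted image instead of the live system.

```python
from pathlib import Path

# Paths as reported in this write-up, relative to the home and temp dirs.
HOME_INDICATORS = [".config/sysmon/sysmon.py"]
TMP_INDICATORS = ["session.key", "payload.enc", "tpcp.tar.gz"]


def find_indicators(home: Path, tmp: Path) -> list[Path]:
    """Return every known indicator file that exists under home or tmp."""
    candidates = [home / rel for rel in HOME_INDICATORS]
    candidates += [tmp / name for name in TMP_INDICATORS]
    return [p for p in candidates if p.exists()]
```

On a live machine that would be find_indicators(Path.home(), Path("/tmp")). Treat hits as confirmation and misses as nothing at all: the temp files exist only during exfiltration, so their absence doesn't clear a machine.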

Check for the .pth file:

find $(python3 -c "import site; print(' '.join(site.getsitepackages()))") \
  -name "*.pth" -exec grep -lE "base64|subprocess|exec" {} \;
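The same audit can be done in Python against any site-packages directory, including one on a mounted disk. This sketch narrows the search to lines that start with import, since those are the only lines in a .pth file that get executed. The token list is a heuristic I picked, not a signature of this malware, and the function name is mine.

```python
import re
from pathlib import Path

# Tokens that rarely belong in a .pth file; a heuristic, not a signature.
SUSPICIOUS = re.compile(r"base64|subprocess|exec|eval")


def audit_pth_files(site_packages: Path) -> dict[str, list[str]]:
    """Map each .pth file name to its executable lines that look suspicious."""
    findings = {}
    for pth in sorted(site_packages.glob("*.pth")):
        text = pth.read_text(encoding="utf-8", errors="replace")
        flagged = [
            line for line in text.splitlines()
            # Only lines starting with "import" are executed at startup.
            if line.startswith(("import ", "import\t")) and SUSPICIOUS.search(line)
        ]
        if flagged:
            findings[pth.name] = flagged
    return findings
```

Some legitimate packages do ship .pth files with import lines, so a hit means "go read this file", not "you're compromised".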

Check Kubernetes:

kubectl get pods -A | grep node-setup-
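For bigger clusters it can be easier to dump the pod list as JSON and filter it offline. Here's a sketch that assumes the usual shape of kubectl get pods -A -o json output (an items list with metadata and spec per pod). The function name and the sample document are made up; only the node-setup- prefix and the privileged flag come from the behaviour described above.

```python
import json


def suspicious_pods(pod_list: dict) -> list[tuple[str, str, bool]]:
    """Return (namespace, name, privileged) for pods named like the backdoor pods."""
    hits = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        name = meta.get("name", "")
        if not name.startswith("node-setup-"):
            continue
        containers = pod.get("spec", {}).get("containers", [])
        privileged = any(
            (c.get("securityContext") or {}).get("privileged", False)
            for c in containers
        )
        hits.append((meta.get("namespace", ""), name, privileged))
    return hits


# Trimmed, made-up sample in the shape of `kubectl get pods -A -o json`.
sample = json.loads("""
{"items": [
  {"metadata": {"namespace": "kube-system", "name": "node-setup-worker-1"},
   "spec": {"containers": [{"image": "alpine:latest",
                            "securityContext": {"privileged": true}}]}},
  {"metadata": {"namespace": "default", "name": "web-7d9f"},
   "spec": {"containers": [{"image": "nginx"}]}}
]}
""")

found = suspicious_pods(sample)
print(found)
```

A name match alone is worth investigating. A name match on a privileged pod is worth treating the node as compromised.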

Then rotate everything: SSH keys, cloud credentials, API keys, database passwords, Kubernetes tokens. Audit AWS Secrets Manager and SSM Parameter Store if instance metadata was accessible. It's a brutal checklist but a necessary one.

Going forward, regardless of whether you were affected:

Pin your dependencies. Use lock files with checksums. Unpinned transitive dependencies are your attack surface.
Audit .pth files in your environments. Most legitimate packages don't install them. If you see one you don't recognize, that's a red flag.
Treat your dev machine like it has prod credentials. Because it probably does.
If you run MCP servers locally, check their dependency manifests. Anything pulling in unpinned versions of large, popular libraries is an exposure.
Consider isolated infra for AI agent experiments. A VM with no cloud credentials mounted, egress restricted to what it actually needs. Yes, it's friction. It's also a lot less friction than rotating all your credentials and auditing your Kubernetes cluster.
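On the pinning point, a quick lint over a requirements file catches the obvious gaps. This is a deliberately naive sketch: it handles the common "name==version --hash=..." layout with backslash continuations and nothing fancier. The function name, package names, versions, and hash are placeholders.

```python
def unpinned_or_unhashed(requirements_text: str) -> list[str]:
    """Return requirement entries lacking an exact '==' pin or a '--hash=' option."""
    # Join backslash-continued lines so each requirement is a single entry.
    joined = requirements_text.replace("\\\n", " ")
    problems = []
    for raw in joined.splitlines():
        line = raw.split(" #", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith(("#", "-")):
            continue  # skip blanks, comment lines, and pip options
        if "==" not in line or "--hash=" not in line:
            problems.append(line.split()[0])
    return problems


# Placeholder entries; none of these are real packages or hashes.
sample = """\
# illustrative requirements file
examplepkg==1.2.3 \\
    --hash=sha256:0000000000000000000000000000000000000000000000000000000000000000
otherpkg>=2.0
thirdpkg==4.5.6
"""

flagged = unpinned_or_unhashed(sample)
print(flagged)
```

Remember what a pin buys you here: it stops a surprise upgrade to a release you never reviewed. It does nothing if you generate the lock file during the window when the malicious version is the latest one.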

The Thing That Sticks With Me

The AI tooling security conversation usually centers on prompt injection: tricking LLMs into doing bad things, the "lethal trifecta" of private data access, exposure to untrusted content, and the ability to communicate externally. That's a real and evolving threat.

But the attack that actually hit people on March 24th required no AI manipulation whatsoever. No jailbreaking. No clever prompt. Just stolen CI/CD credentials, a malicious PyPI upload, and Python's decades-old .pth mechanism doing exactly what it was designed to do. The most sophisticated-looking threat in the AI ecosystem was beaten by the oldest trick in the supply chain book.

The irony is that LiteLLM, a tool purpose-built to manage access to AI systems, became the delivery vehicle for an attack that had nothing to do with AI at all. It was just a package. With dependencies. In a pipeline. Like everything else.

Pin your dependencies. Isolate your infra. And maybe double-check which security scanners your CI/CD is pulling.
