What we found

PyPI keeps showing up in supply chain attacks, usually as the place malicious packages get published. That made us curious about the credentials already sitting in the open. GitGuardian's Public Monitoring scans public sources for leaked secrets, so we pulled the PyPI tokens it had surfaced and started digging.

On GitHub alone we counted 19,586 occurrences, which collapsed to 4,869 unique strings once we deduplicated. A smaller batch of 53 tokens came from public Docker Hub images.

Not every match is a real token. Some are visibly malformed or padded with placeholder text, like this one:

pypi-EXAMPLETOKEN[..]

After dropping the malformed strings and the tokens scoped to test.pypi.org rather than pypi.org, we were left with 3,714 unique pypi.org tokens worth analyzing.
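
A clean-up pass of this kind could be sketched as follows. This is an illustration, not the pipeline used for the study: the helper names and the three checks (a pypi- prefix, a base64url-only payload, and a decoded payload that starts with the macaroon v2 version byte) are assumptions made for the example.

```python
import base64
import binascii
import re

# Assumed candidate shape: "pypi-" followed only by base64url characters.
CANDIDATE = re.compile(r"^pypi-[A-Za-z0-9_-]+={0,2}$")


def is_plausible(raw: str) -> bool:
    """Return True if the string could be a real PyPI token (assumed checks)."""
    if not CANDIDATE.match(raw):
        return False
    payload = raw[len("pypi-"):].rstrip("=")
    try:
        data = base64.urlsafe_b64decode(payload + "=" * (-len(payload) % 4))
    except (binascii.Error, ValueError):
        return False
    # Assumption: the payload is a macaroon in the v2 binary format,
    # whose first byte is the version number 2.
    return data[:1] == b"\x02"


def plausible_pypi_tokens(occurrences):
    """Deduplicate raw matches, then keep only the plausible ones."""
    unique = {s.strip() for s in occurrences}
    return sorted(s for s in unique if is_plausible(s))
```

With these checks, the placeholder string shown above is rejected because of its bracket characters, before any decoding is attempted.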

Reading a token without using it

PyPI API tokens are macaroons. A macaroon is a bearer token that carries its own restrictions, called caveats, baked into the token itself. You can decode one without the signing secret and read what is inside. We used the Python module pypitoken to do exactly that.

Each decoded token gives you three useful things: the location it targets (either test.pypi.org or pypi.org), a unique token identifier, and the list of restrictions that scope what the token is allowed to do. The location is also what let us filter the set down to pypi.org earlier.
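
To make the "no signing secret needed" point concrete, here is a standard-library sketch of reading those fields. It is not how pypitoken works internally and not the code used for the study. It assumes the token is pypi- followed by a base64url-encoded macaroon in the v2 binary layout (version byte 2, then typed, length-prefixed fields, with type 1 for the location and type 2 for identifiers). The peek helper and its return shape are hypothetical, and the signature is never verified. For real work, a maintained library is the better choice.

```python
import base64


def _varint(buf, pos):
    """Read one unsigned varint (7 bits per byte); return (value, next_pos)."""
    value = shift = 0
    while True:
        byte = buf[pos]  # IndexError on truncated input, handled in peek()
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte < 0x80:
            return value, pos
        shift += 7


def _field(buf, pos):
    """Read one field; type 0 is an end-of-section marker with no payload."""
    ftype, pos = _varint(buf, pos)
    if ftype == 0:
        return 0, b"", pos
    length, pos = _varint(buf, pos)
    end = pos + length
    if end > len(buf):
        raise IndexError("field runs past end of data")
    return ftype, buf[pos:end], end


def peek(raw_token):
    """Return location, identifier and raw caveats without checking the signature."""
    try:
        payload = raw_token.split("-", 1)[1]
        data = base64.urlsafe_b64decode(payload + "=" * (-len(payload) % 4))
        if data[:1] != b"\x02":
            raise ValueError("not a v2 macaroon")
        pos, location, identifier, caveats = 1, None, None, []
        # Header section: optional location, then the token identifier.
        while True:
            ftype, value, pos = _field(data, pos)
            if ftype == 0:
                break
            if ftype == 1:
                location = value.decode("utf-8", "replace")
            elif ftype == 2:
                identifier = value.decode("utf-8", "replace")
        # Caveat sections; an empty section ends the list.
        while True:
            ftype, value, pos = _field(data, pos)
            if ftype == 0:
                break
            while ftype != 0:
                if ftype == 2:
                    caveats.append(value.decode("utf-8", "replace"))
                ftype, value, pos = _field(data, pos)
    except IndexError as exc:
        raise ValueError("truncated or malformed token") from exc
    return {"location": location, "identifier": identifier, "caveats": caveats}
```

Everything the sketch returns is readable by anyone holding the string. Only the trailing signature, which it skips, depends on the secret key.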

Those restrictions are the interesting part. They came in four flavors across our pypi.org set (a small number did not parse cleanly into any category):

  • 2,444 scoped to a user (UserIDRestriction)
  • 740 scoped to specific project IDs (ProjectIDsRestriction)
  • 339 with no real restriction (LegacyNoopRestriction)
  • 168 scoped to legacy project names (LegacyProjectNamesRestriction)
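
As a quick arithmetic check on the breakdown above, the snippet below tallies the four categories against the 3,714 unique pypi.org tokens. The variable names are ours, and the percentages are derived from the published counts rather than quoted from the study.

```python
from collections import Counter

TOTAL_PYPI_ORG_TOKENS = 3714  # unique pypi.org tokens kept for analysis

by_restriction = Counter({
    "UserIDRestriction": 2444,
    "ProjectIDsRestriction": 740,
    "LegacyNoopRestriction": 339,
    "LegacyProjectNamesRestriction": 168,
})

categorized = sum(by_restriction.values())
# Tokens that did not parse cleanly into any of the four categories.
uncategorized = TOTAL_PYPI_ORG_TOKENS - categorized

# Share of the whole set, in percent, largest category first.
shares = {
    name: round(100 * count / TOTAL_PYPI_ORG_TOKENS, 1)
    for name, count in by_restriction.most_common()
}

print(f"categorized: {categorized}, uncategorized: {uncategorized}")
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The four categories sum to 3,691, which leaves 23 tokens in the "did not parse cleanly" group.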

A project-scoped token is a small intelligence leak on its own. Even before you know whether the token works, it tells you which project it was meant for.

Checking which tokens were still live

Decoding tells you what a token claims. It does not tell you whether it still works. To check that safely, we mimicked what the twine upload command does and sent a deliberately broken package to the PyPI API. The HTTP response codes were enough to classify each token without ever publishing anything:

  • A 403 means the credentials are invalid.
  • A 400 means the credentials worked, but the file we sent was not a valid tarball.

So a 400 means the token is still valid. As a bonus, the API error messages sometimes leak the account login tied to the token, which later helped map tokens back to users.
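
The classification rule can be written down in a few lines. This sketch deliberately sends nothing over the network. It only shows how a token is presented to the upload API (HTTP Basic auth with the username __token__, as twine does) and how the two status codes described above map to a verdict. The function names are hypothetical, and treating every other status as "unknown" is our assumption.

```python
import base64


def basic_auth_header(token: str) -> str:
    """Build the Authorization header value for an API token.

    PyPI API tokens authenticate as the user "__token__",
    with the token itself as the password.
    """
    creds = f"__token__:{token}".encode()
    return "Basic " + base64.b64encode(creds).decode()


def classify(status_code: int) -> str:
    """Map the upload response status to a token verdict."""
    if status_code == 400:
        # Credentials accepted, broken package rejected.
        return "valid"
    if status_code == 403:
        # Credentials rejected.
        return "invalid"
    # Anything else (rate limiting, server errors, ...) is inconclusive.
    return "unknown"
```

Keeping an explicit "unknown" bucket avoids counting a rate-limited or failed request as a revoked token.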

By this method, 62 tokens came back valid for pypi.org: 61 from GitHub and one from Docker Hub. The GitHub count surprised us. Since 2023, GitHub has scanned public repositories for leaked PyPI tokens and notified PyPI so that they can be revoked automatically. In theory, almost no valid PyPI token should survive on GitHub.

Practice is different. Using GitHub metadata, we traced the first-leak date for each of the 61 GitHub tokens. The distribution mixes old and recent leaks, and most were first exposed in 2024:

First leaked    Valid tokens
2021            9
2022            1
2024            49
2025            2

These tokens all leaked on GitHub, where scanning should have caught them. We do not yet know why so many survived, and especially why 2024 dominates. Our leading hypothesis is a gap in GitHub's scanning coverage, for example leaks in file locations or formats the scanner does not inspect. If you have a better explanation, we would like to hear it.

Responsible disclosure

Before contacting PyPI, we wanted to size up the blast radius from the outside. Using the project restrictions, we mapped 9 valid tokens back to 9 specific PyPI packages. With ClickPy, we estimated they account for roughly 10,000 downloads per month. We then investigated the GitHub repositories that contained the leaked tokens and identified 7 more packages, adding about 3,000 monthly downloads. That gave us 16 packages before PyPI looked at its own records.

A valid PyPI token is exactly the kind of credential that fuels a supply chain attack, so we reported the findings to the PyPI security team at security@pypi.org. We first shared the relevant token identifiers, then walked through the data together on a call.

The exchange both confirmed the impact and widened the blast radius. PyPI's own records showed 62 of the tokens still valid, tied through user roles to 125 live projects with around 25,000 monthly downloads. Our outside view had caught 16 of those packages; their internal view revealed the full 125.
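
To put the two views side by side, the figures quoted above can be compared directly. All download numbers are rough estimates, so the resulting percentages are only indicative. The snippet assumes the 16 externally identified packages are a subset of the 125.

```python
# Outside view: packages found via project restrictions plus repository review.
outside = {"packages": 9 + 7, "monthly_downloads": 10_000 + 3_000}

# Inside view: PyPI's own records for the same valid tokens.
inside = {"packages": 125, "monthly_downloads": 25_000}

# Percentage of the inside view that was visible from the outside.
coverage = {key: round(100 * outside[key] / inside[key], 1) for key in outside}

for key, pct in coverage.items():
    print(f"{key}: {outside[key]} of {inside[key]} ({pct}%)")
```

On these estimates, the outside view saw about 13% of the affected packages but roughly half of the download volume.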

The PyPI security team invalidated the tokens and sent notices to the affected users with this note: "Leaked credential identified via GitGuardian; proactively revoked." They also built new admin tooling to simplify future disclosures, shipped in warehouse PR #20133.

Takeaways

A leaked PyPI token is not an abstract risk. Out of thousands of exposed strings, 62 were live keys to 125 real projects with roughly 25,000 combined monthly downloads, and any one of them could have been used to push a malicious release.

For scale, 25,000 monthly downloads is modest next to the npm worms that reach hundreds of millions, but it is in line with or larger than most documented PyPI compromises, which often involve a few hundred to a few thousand downloads. PyPI simply runs at a smaller scale than npm, so the raw count understates the risk.

Several habits would have caught most of this earlier. Scan your own commits and images for secrets before they reach a public repository. Scope your PyPI tokens to a single project so a leak limits the blast radius instead of handing over a whole account. Make sure that sensitive files such as .pypirc and .env are ignored by git.
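
As an illustration of the first habit, here is a minimal local check that walks a directory and flags lines that look like pypi.org tokens. It is a sketch, not a replacement for a dedicated secret scanner. The pattern assumes that pypi.org tokens start with pypi-AgEIcHlwaS5vcmc (the base64 encoding of a macaroon header naming pypi.org), so it will not match test.pypi.org tokens. The minimum length of 50 and the find_tokens helper are our own choices.

```python
import re
from pathlib import Path

# Assumed pattern: fixed pypi.org header prefix, then a long base64url tail.
TOKEN_PATTERN = re.compile(r"pypi-AgEIcHlwaS5vcmc[A-Za-z0-9_-]{50,}")


def find_tokens(root):
    """Return (path, line_number) pairs for lines that look like PyPI tokens."""
    root = Path(root)
    hits = []
    for path in sorted(root.rglob("*")):
        # Skip directories and git's own object store.
        if not path.is_file() or ".git" in path.relative_to(root).parts:
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file, nothing to report
        for lineno, line in enumerate(text.splitlines(), start=1):
            if TOKEN_PATTERN.search(line):
                hits.append((str(path), lineno))
    return hits
```

Run against a working tree before pushing, a check like this would flag a .pypirc or .env file that slipped past .gitignore. It reports only the file and line number, so the token itself is never echoed to the terminal.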

From a research perspective, the results show why it pays to challenge assumptions. We found valid PyPI tokens sitting on GitHub, where they were supposed to be detected and revoked automatically. The safeguards are real, but they are not airtight.