In February 2021, security researcher Alex Birsan published a write-up describing a novel supply chain attack technique he called "dependency confusion." By the time he disclosed it, he had used it to run code inside the internal networks of Apple, Microsoft, Shopify, PayPal, and dozens of other organizations, without exploiting a single CVE. The attack relied entirely on a design assumption most package managers make that turns out to be exploitable.
Understanding how dependency confusion works is now baseline knowledge for any engineering team using npm, pip, RubyGems, or NuGet, which is essentially every engineering team. Here's the mechanism, why it's still relevant, and what your pipeline needs to address it.
How the attack works
Most large organizations use a mix of public packages from registries like npmjs.org or PyPI, and internal private packages hosted on their own artifact registry (Artifactory, AWS CodeArtifact, Azure Artifacts, and so on). The private packages typically have names that are only meaningful internally — something like @mycompany/auth-utils or mycompany-internal-sdk.
The vulnerability lies in how many package managers handle name resolution when both a public registry and a private registry are configured. When a developer runs npm install or pip install, the package manager may check the public registry as well as the private one. If a package name exists on the public registry with a version number higher than the private one, many configurations will prefer the public version — because it appears to be "newer."
Birsan's technique was to register package names matching organizations' known internal package names on public registries, with version numbers higher than any plausible internal version (e.g., 9.9.9). When those organizations' build systems ran dependency installation, they fetched Birsan's public packages instead of their own internal ones. His packages contained only research disclosure code — but a malicious actor would publish code that exfiltrates environment variables, executes a reverse shell, or installs a backdoor.
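The resolution flaw can be sketched in a few lines of Python. This is a toy model, not the code of any real package manager: the function name pick_candidate and the registry contents are hypothetical, and versions are assumed to be plain dotted integers.

```python
from typing import Dict, List, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '1.4.2' into (1, 4, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def pick_candidate(name: str, indexes: Dict[str, Dict[str, List[str]]]) -> Tuple[str, str]:
    """Return (index_name, version) for the highest version on ANY index.

    This models the unsafe behaviour: no index has priority, so whoever
    publishes the largest version number wins.
    """
    candidates = [
        (parse_version(version), index_name, version)
        for index_name, packages in indexes.items()
        for version in packages.get(name, [])
    ]
    if not candidates:
        raise LookupError(f"{name} not found on any index")
    _, index_name, version = max(candidates)
    return index_name, version


# Hypothetical registry contents, for illustration only.
indexes = {
    "private": {"mycompany-internal-sdk": ["1.2.0", "1.3.1"]},
    "public": {"mycompany-internal-sdk": ["9.9.9"]},  # attacker's upload
}

print(pick_candidate("mycompany-internal-sdk", indexes))  # ('public', '9.9.9')
```

Nothing in this model asks who published the package. The only input the attacker needs to control is the version number.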
The Codecov Bash Uploader incident (April 2021; no CVE was assigned, because it was a supply chain compromise rather than a software vulnerability) was different in mechanism but the same in lesson: the build-time supply chain is an attack surface that most CI/CD pipelines were not designed to defend.
Why this is still a live threat
It's tempting to read dependency confusion as a 2021 problem that's been patched. The package managers have added features. Some registries have implemented namespace squatting prevention. But the underlying issue remains common: organizations don't have a full inventory of which packages their build systems resolve, or from where.
A 2024 analysis of major npm packages found that a meaningful percentage of large organizations' dependency trees include packages without integrity hashes in their lockfiles, meaning the package manager has no way to verify that the artifact it downloads is the one that was originally locked. Typosquatting (registering reqeusts instead of requests) remains an active technique; PyPI regularly removes malicious packages that have accumulated thousands of downloads before detection.
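A crude typosquatting check can be run against your own dependency list. The sketch below uses Python's standard difflib to flag names that are suspiciously close to, but not the same as, a well-known package. The function name near_misses, the 0.85 threshold, and the short list of well-known names are all assumptions for illustration; string similarity is a heuristic and will produce both false positives and misses.

```python
from difflib import SequenceMatcher
from typing import Iterable, List, Tuple


def near_misses(
    dependencies: Iterable[str],
    well_known: Iterable[str],
    threshold: float = 0.85,  # assumed cut-off; tune for your own data
) -> List[Tuple[str, str]]:
    """Return (dependency, lookalike) pairs that are similar but not identical."""
    known = sorted(set(well_known))
    flagged = []
    for dep in dependencies:
        if dep in known:
            continue  # an exact match is the real package name
        for target in known:
            if SequenceMatcher(None, dep, target).ratio() >= threshold:
                flagged.append((dep, target))
    return flagged


# Hypothetical inputs for illustration.
popular = ["requests", "numpy", "flask"]
declared = ["reqeusts", "numpy", "flask"]

for dep, target in near_misses(declared, popular):
    print(f"{dep!r} looks like {target!r}; verify it is the package you intended")
```

A check like this is most useful as a review prompt when a new dependency is added, not as a blocking gate.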
The threat surface has also expanded. Docker base images introduce the same class of risk. A FROM node:18-alpine in your Dockerfile trusts Docker Hub's resolution logic. If Docker Hub's copy of that tag has been replaced — through a compromised push or a tag overwrite — your build is compromised. This is why image digest pinning (FROM node@sha256:...) matters more than tag pinning.
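Floating base images are easy to detect mechanically. The following is a minimal sketch, assuming a simple Dockerfile: it lists every FROM image that is not pinned by a sha256 digest, skipping scratch and references to earlier build stages. The function name unpinned_base_images is hypothetical, and images supplied through ARG substitution are not resolved, so they will be reported as unpinned.

```python
import re
from typing import List

# A digest pin looks like image@sha256:<64 hex characters>.
DIGEST_PIN = re.compile(r"@sha256:[0-9a-f]{64}$")


def unpinned_base_images(dockerfile_text: str) -> List[str]:
    """Return FROM images that are referenced by tag or name instead of digest."""
    stage_names = set()
    unpinned = []
    for raw_line in dockerfile_text.splitlines():
        tokens = raw_line.strip().split()
        if len(tokens) < 2 or tokens[0].upper() != "FROM":
            continue
        # Drop flags such as --platform=linux/amd64.
        args = [tok for tok in tokens[1:] if not tok.startswith("--")]
        if not args:
            continue
        image = args[0]
        is_stage_ref = image.lower() in stage_names
        # Remember "FROM image AS name" so later stages can refer to it.
        if len(args) >= 3 and args[1].upper() == "AS":
            stage_names.add(args[2].lower())
        if image == "scratch" or is_stage_ref:
            continue
        if not DIGEST_PIN.search(image):
            unpinned.append(image)
    return unpinned


example = "\n".join([
    "FROM node:18-alpine AS build",
    "FROM build AS test",
    "FROM node@sha256:" + "0" * 64,  # placeholder digest, not a real image
    "FROM scratch",
])
print(unpinned_base_images(example))  # ['node:18-alpine']
```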
What your pipeline needs to address this
The controls fall into three layers: registry configuration, lockfile integrity, and build-time verification.
Registry configuration. If you use private packages, your package manager configuration must explicitly scope private package names to your private registry and not fall back to public registries for those scopes. In npm, this means .npmrc with explicit scope-to-registry mappings. In pip, this means --index-url pointing to your private registry (with any public packages you need proxied through it), not --extra-index-url. pip does not treat an extra index as a lower-priority fallback: it gathers candidates from every configured index and installs the best-matching version wherever it lives, so a higher-numbered public package wins. The distinction between --index-url and --extra-index-url is exactly the attack surface dependency confusion exploits.
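The npm half of this check can be automated. The sketch below parses .npmrc text for lines of the form @scope:registry=URL and reports any private scope that is unmapped or mapped to a host other than your private registry. The function names, the scope names, and the host npm.internal.example are hypothetical; only the @scope:registry key format comes from npm itself.

```python
from typing import Dict, Iterable, List
from urllib.parse import urlparse


def scope_registries(npmrc_text: str) -> Dict[str, str]:
    """Extract {'@scope': 'registry URL'} from .npmrc contents."""
    mapping = {}
    for raw_line in npmrc_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # blank line or comment
        key, sep, value = line.partition("=")
        key = key.strip()
        if sep and key.startswith("@") and key.endswith(":registry"):
            mapping[key[: -len(":registry")]] = value.strip()
    return mapping


def misrouted_scopes(
    npmrc_text: str, private_scopes: Iterable[str], private_host: str
) -> List[str]:
    """Return private scopes that are missing or point at another host."""
    mapping = scope_registries(npmrc_text)
    return [
        scope
        for scope in private_scopes
        if urlparse(mapping.get(scope, "")).hostname != private_host
    ]


# Hypothetical configuration for illustration.
npmrc = """\
registry=https://registry.npmjs.org/
@mycompany:registry=https://npm.internal.example/
"""

print(misrouted_scopes(npmrc, ["@mycompany", "@mycompany-tools"], "npm.internal.example"))
# ['@mycompany-tools']
```

A scope that shows up in the result would be resolved against the default registry, which is the situation the scoping rule is meant to prevent.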
Lockfile integrity. Commit your lockfiles. Every package-lock.json, yarn.lock, poetry.lock, or Pipfile.lock should be in version control and should be the source of truth for your CI builds. Running npm ci instead of npm install in CI enforces that the lockfile is used exactly: it installs the locked versions, fails if package.json and the lockfile disagree, and never silently upgrades or rewrites the lockfile. If your CI builds are doing npm install, you're not getting the protection the lockfile provides.
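A committed lockfile can also be audited before the install step runs. The sketch below assumes an npm package-lock.json in lockfileVersion 2 or 3 format, where each entry under "packages" normally carries "resolved" and "integrity" fields. It reports entries with no integrity hash or with a download host outside an allowlist. The function name audit_lockfile, the hosts, and the sample data are hypothetical.

```python
from typing import List, Set
from urllib.parse import urlparse


def audit_lockfile(lock: dict, allowed_hosts: Set[str]) -> List[str]:
    """Return human-readable problems found in a parsed package-lock.json."""
    problems = []
    for path, entry in lock.get("packages", {}).items():
        # Skip the root project, workspace links, and bundled dependencies,
        # which do not carry their own download URL.
        if path == "" or entry.get("link") or entry.get("inBundle"):
            continue
        if "integrity" not in entry:
            problems.append(f"{path}: missing integrity hash")
        host = urlparse(entry.get("resolved", "")).hostname
        if host not in allowed_hosts:
            problems.append(f"{path}: resolved from unexpected host {host}")
    return problems


# Hypothetical lockfile contents; in CI this would come from
# json.load(open("package-lock.json")).
lock = {
    "lockfileVersion": 3,
    "packages": {
        "": {"name": "app"},
        "node_modules/left-pad": {
            "version": "1.3.0",
            "resolved": "https://registry.npmjs.org/left-pad/-/left-pad-1.3.0.tgz",
            "integrity": "sha512-placeholder",
        },
        "node_modules/@mycompany/auth-utils": {
            "version": "9.9.9",
            "resolved": "https://registry.evil.example/auth-utils-9.9.9.tgz",
        },
    },
}

for problem in audit_lockfile(lock, {"registry.npmjs.org", "npm.internal.example"}):
    print(problem)
```

Failing the build on a non-empty result turns the lockfile into an enforced policy instead of a record.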
Build-time verification. Tools like Snyk, OWASP Dependency-Check, and GitHub's Dependabot can audit your dependency tree against known malicious packages and known vulnerabilities. These should run in CI as a blocking gate for critical findings. For Docker images specifically, tools like Trivy and Grype can scan image layers for both CVEs and known malicious packages.
The private package namespace problem
If your organization publishes private packages on internal registries, those package names should also be registered on public registries — as empty packages or placeholder packages that prevent a malicious actor from claiming them. This is namespace squatting in your own defense. It's not a perfect solution, but it closes the attack vector for the most obvious case.
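GitHub's dependency review action can surface the dependency changes a PR introduces and block known-vulnerable additions. A related signal worth checking in review is a lockfile diff in which a dependency's source registry changes between the base branch and the PR, which may indicate that a dependency confusion attack has occurred.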
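A change of source registry can also be detected directly from two lockfiles. This is a minimal sketch, not the implementation of any GitHub feature: it assumes npm package-lock.json files in lockfileVersion 2 or 3 format, already parsed into dictionaries, and the function name registry_changes and the sample hosts are hypothetical.

```python
from typing import Dict, List, Optional, Tuple
from urllib.parse import urlparse


def resolved_hosts(lock: dict) -> Dict[str, Optional[str]]:
    """Map each locked package path to the host it was downloaded from."""
    return {
        path: urlparse(entry["resolved"]).hostname
        for path, entry in lock.get("packages", {}).items()
        if path and "resolved" in entry
    }


def registry_changes(base_lock: dict, pr_lock: dict) -> List[Tuple[str, Optional[str], Optional[str]]]:
    """Return (package path, old host, new host) for packages whose host changed."""
    before, after = resolved_hosts(base_lock), resolved_hosts(pr_lock)
    return [
        (path, before[path], after[path])
        for path in sorted(before.keys() & after.keys())
        if before[path] != after[path]
    ]


# Hypothetical lockfiles from the base branch and the PR branch.
base = {"packages": {
    "node_modules/mycompany-internal-sdk": {
        "resolved": "https://npm.internal.example/mycompany-internal-sdk-1.3.1.tgz"},
}}
pr = {"packages": {
    "node_modules/mycompany-internal-sdk": {
        "resolved": "https://registry.npmjs.org/mycompany-internal-sdk/-/mycompany-internal-sdk-9.9.9.tgz"},
}}

for path, old, new in registry_changes(base, pr):
    print(f"{path}: source changed from {old} to {new}")
```

A private package that suddenly resolves from the public registry, as in this example, deserves a manual look before the PR merges.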
Honest assessment of the residual risk
These controls do not provide complete protection. A sufficiently sophisticated attacker who gains control of a widely used public package (through a maintainer account compromise, or by taking over a package its maintainers have abandoned) can still inject malicious code into your dependency tree. The XZ Utils backdoor (CVE-2024-3094), inserted through a social engineering campaign that ran for roughly two years, illustrates that supply chain attacks can be patient and sophisticated enough to evade most automated detection.
The realistic goal is to close the straightforward attacks (dependency confusion, typosquatting, unpinned Docker images) through configuration hygiene and automated scanning, and to reduce the blast radius of sophisticated attacks by running builds in minimal-permission environments and monitoring for unexpected outbound network connections from build processes.
The engineering team that hasn't audited its .npmrc configuration, whose CI runs npm install without lockfile enforcement, and whose Docker images use floating tags is exposed to the straightforward attacks that have been successfully deployed against major organizations. That's the baseline to fix first.