[sf-lug] Malware on PyPI repository
rick at linuxmafia.com
Sun Dec 5 18:59:46 PST 2021
Quoting Akkana Peck (akkana at shallowsky.com):
> That's correct. It has about the same security as installing from a
> github repo (i.e. basically none), but it's a lot easier for the user.
> Packages on PyPI are signed, but that by itself doesn't tell you
> anything since anyone can create a GPG key and sign a package.
Better than nothing -- in that it at least foils tampering by third
parties (such as someone who's compromised security on the hosting
site), and, if you get hosed, at least you can trace who (or at least
what GPG keyholder persona) hosed you.
So, it doesn't prevent getting hosed by Moriarty, the Napoleon of Crime,
but _does_ avert getting hax0red by Moriarty's brother Fredo.
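For concreteness, the kind of check being discussed can be sketched in a few lines of Python that shell out to gpg (a minimal sketch, assuming gpg is installed and you have already imported the signer's public key; the filenames and function names are hypothetical):

```python
import subprocess

def gpg_verify_argv(tarball, sig):
    # Build the gpg invocation that checks a detached signature.
    # gpg exits nonzero if the signature doesn't match or the key is
    # unknown -- but it says nothing about whether the keyholder is
    # honest, which is exactly the limitation discussed above.
    return ["gpg", "--verify", sig, tarball]

def verify(tarball, sig):
    return subprocess.run(gpg_verify_argv(tarball, sig)).returncode == 0
```

A passing check proves only continuity of identity: the same keyholder (Moriarty or otherwise) signed this tarball, and nobody tampered with it in transit.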
> Unfortunately, a virtualenv doesn't protect your system at all.
> It's not a chroot or anything like that, just a set of environment
> variables defining things like PYTHONPATH.
Yes, I merely meant code running in the venv is isolated to that degree
from the system -- more a protection against mishap than against malice.
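That point is easy to demonstrate: a venv is detected purely from interpreter paths, and code running inside one retains full filesystem access (a minimal sketch; the only constraint on the write below is ordinary Unix permissions):

```python
import os
import sys
import tempfile

def in_venv():
    # A venv is recognized solely by sys.prefix differing from the
    # base interpreter's prefix -- no chroot, no sandbox, nothing
    # kernel-level is involved.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)

# In or out of a venv, code can still write anywhere permissions allow:
with tempfile.NamedTemporaryFile(mode="w", delete=False) as f:
    f.write("a venv would not have stopped this write\n")
    path = f.name
os.remove(path)
```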
> I suspect that if someone managed to upload malware to PyPI as part
> of a well known project, it would get noticed pretty quickly. So if
> you're getting, say, flask or matplotlib from PyPI, you're probably
> pretty safe. On the other hand, if you're downloading and running
> "10Cent11" or "importantpackage" without doing any research on them
> ... well, not so safe.
I very much liked the JFrog blog comments you pointed to, describing
the malign uses of typosquatting, e.g., relying on the dodgy and
invasive package "distutil" getting confused with the well-known
package "distutils". This is a recurring subvariety of social engineering,
e.g., http://linuxmafia.com/~rick/lexicon.html#frogery .
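The typosquatting trick lends itself to a simple heuristic defense: flag any requested package name that sits within a small edit distance of a well-known one without matching it exactly (a toy sketch; the POPULAR list is illustrative, not any real denylist tooling):

```python
def levenshtein(a, b):
    # Classic dynamic-programming edit distance, one row at a time.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

POPULAR = {"distutils", "requests", "numpy"}

def suspicious(name, max_distance=1):
    # Near-misses of popular names are suspect; exact matches are not.
    return [p for p in POPULAR if 0 < levenshtein(name, p) <= max_distance]
```

So "distutil" trips the alarm (distance 1 from "distutils"), while "distutils" itself passes.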
> > effectively uncurated (or loosely curated) "bazaar" code hosting sites,
> > e.g., the older instantiation of addons.mozilla.org (before Mozilla,
> > Inc. cracked down on the dangerous chaos there), Gnome-look.org, and
> Did they? I thought only plugins marked "recommended" had been
> curated, and that anyone could still submit a plugin.
I haven't followed closely what a.m.o (addons.mozilla.org) has been
doing since the company's shocking decision that Firefox would no longer
run non-Mozilla-signed extensions. That certainly fixed one problem for
the corporation, but in my view was open-source-hostile and, along with
other things, has largely impelled me to look elsewhere.
This signed-by-us-only policy still means that, yes, anyone may still
submit an extension, and _if_ Mozilla, Inc. is willing to sign that
code, then it will be made available at a.m.o unless/until Mozilla, Inc.
removes it -- but also means that nobody may run an extension in recent
versions of Firefox that Mozilla, Inc. hasn't literally signed off on.
I considered that policy change shocking, and a unilateral abridgement
of user freedom that ought to make users of open source code wary, to
say the least. If I am allowed to run only what corporate management
approves of, then it's no longer really open source.
> Right. Sometimes you want/need something that isn't in distro repos,
> but when doing so, always be conscious of the risks.
I tried to articulate the nuances of that problem years ago, when I was
one of _Linux Gazette's_ main editors, at
 Rick Moen comments: While it's useful and worthwhile to know about
a program's "upstream" development site, where (among other things) the
author's latest source code can be downloaded, there are a few
disadvantages that should be noted (and some alternative locations that
should usually be preferred, instead, if such are findable):
1. Absent extraordinary measures on your part, your Linux distribution's
package-tracking system won't know about the program's presence on your
system. Therefore, it won't know to avoid installing conflicting
programs, removing libraries it depends on, etc.
2. You won't get any tweaks and enhancements that may be normal (or
necessary!) for applications on your Linux distribution — unless you
yourself implement them. You won't get security patches, either, except
those written by the upstream author.
3. Along those same lines, the desirable version to compile and run may
well not be the author's latest release: Sometimes, authors are trying
out new concepts, and improvements & old bugs fixed are outweighed by
misfeatures & new bugs introduced.
4. As a person downloading the upstream author's source code directly,
you have to personally assume the burden of verifying that the tarball
really is the author's work, and not that of (e.g.) a network intruder
who cracked the download FTP site and substituted a trojaned version.
Although this concern applies mostly to software designed to run with
elevated privilege, it's not a strictly academic risk: Linux-relevant
codebases that have been (briefly) trojaned in this fashion, in recent
years, on the upstream author's download sites, include Wietse Venema's
TCP Wrappers (tcpd/libwrap), the util-linux package, sendmail, OpenSSH,
and the Linux kernel (CVS gateway's archive, only). Unless you are
prepared to meaningfully verify the author's cryptographic signature —
if any — on that tarball, you risk sabotaging your system's security.
(None of those upstream trojanings escaped into Linux distributions
because of distribution packager vigilance. Make sure you can be as good
a gatekeeper, or rely on those who already do the job well.)
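Even short of full signature verification, the minimum gatekeeping step is comparing the tarball's digest against a checksum published somewhere the attacker is unlikely to also control (a minimal sketch using only the standard library; the filename is hypothetical):

```python
import hashlib

def sha256_of(path, chunk=1 << 16):
    # Hash the file in chunks so large tarballs don't need to fit in RAM.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

# Compare against the digest published on the project's release page
# (ideally itself GPG-signed, or the check proves little):
#   assert sha256_of("foo-1.2.tar.gz") == published_digest
```

Note the caveat in the comment: a checksum fetched from the same compromised server as the tarball verifies nothing.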
All of the above are problems normally addressed (and the burden of
solving them, shouldered) by Linux distributions' package maintainers,
so that you won't have to. It's to your advantage to make use of
that effort, if feasible. The memory of when a thousand Linux sysadmins,
circa 1993, would need to do all of that work 999-times redundantly, is
still fresh to us old-timers: We call those the Bad Old Days, given that
today one expert package maintainer can instead do that task for a
thousand sysadmins. And yes, sometimes there's nothing like such a
package available, and you have no reasonable alternative but to grab
upstream source tarballs — but the disadvantages justify some pains to
search for suitable packages, instead.
Depending on your distribution, you may find that there are update
packages available directly from the distribution's package updating
utilities, or from ancillary, semi-official package archives (e.g., the
Fedora Extras and "dag" repositories for Fedora/RH and similar
distributions), or, failing that, third-party packages maintained by
reputable outside parties, e.g., some of the Debian-and-compatible
repositories registered at the apt-get.org and backports.org sites.
Although those are certainly not unfailingly better than tarballs, I
would say they're generally so.
The smaller, less popular, and less dependency-ridden a package is, the
more you might be tempted to use an upstream source tarball. For
example, I use locally compiled versions of the Leafnode pre-2.0 betas
to run my server's local NNTP newsgroups, because release-version
packages simply lack that functionality altogether. On the other hand,
that package's one dependency, the PCRE (Perl-Compatible Regular
Expressions) library, I satisfy from my distribution's official
packages, for all the reasons stated above.
[RM's post-publication addendum: I should have also mentioned that
package-oriented distributions generally also have simple toolsets to
craft local packages from tarballs when that is the best option for
whatever reason, rounding out my listing of ways to avoid the lingering
menace of unpackaged system software on software architectures with
good, useful package management. (Linux distributions that either
deliberately or otherwise lack software & configuration management using
package management tools are, obviously, a separate topic, not addressed
here.)]
[And, in case it wasn't obvious, binary tarballs from upstream
distribution sites aren't significantly better than source ones. Either
way, you get software your distro package regime knows nothing about,
that won't get maintenance updates, that isn't built for your distro
specifically, that poses code-authentication challenges you probably
aren't equipped to shoulder, that doesn't benefit from
distro-package-maintainer quality control, and that you might not even
be able to figure out how to cleanly remove. All you gain with binary
tarballs over source tarballs is avoiding the need to compile locally.]