[sf-lug] Malware on PyPI repository

Rick Moen rick at linuxmafia.com
Sun Dec 5 18:59:46 PST 2021

Quoting Akkana Peck (akkana at shallowsky.com):

> That's correct. It has about the same security as installing from a
> github repo (i.e. basically none), but it's a lot easier for the user.
> Packages on PyPI are signed, but that by itself doesn't tell you
> anything since anyone can create a GPG key and sign a package.

Better than nothing -- in that it at least foils tampering by third
parties (such as someone who's compromised security on the hosting
site), and, if you get hosed, at least you can trace who (or at least
what GPG keyholder persona) hosed you.

So, it doesn't prevent getting hosed by Moriarty, the Napoleon of Crime,
but _does_ avert getting hax0red by Moriarty's brother Fredo.

> Unfortunately, a virtualenv doesn't protect your system at all.
> It's not a chroot or anything like that, just a set of environment
> variables defining things like PYTHONPATH.

Yes, I merely meant code running in the venv is isolated to that degree
from the system -- more a protection against mishap than against malice.
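A quick way to see that for yourself, using only the standard library (a minimal sketch; directory names assume a POSIX layout):

```python
import os
import subprocess
import sys
import tempfile

# A venv is just a directory tree plus a pyvenv.cfg file; "activating"
# it merely prepends its bin/ to PATH and sets VIRTUAL_ENV.  There is
# no chroot, no namespace, no filesystem isolation of any kind.
with tempfile.TemporaryDirectory() as tmp:
    venv_dir = os.path.join(tmp, "venv")
    subprocess.run(
        [sys.executable, "-m", "venv", "--without-pip", venv_dir],
        check=True,
    )
    # The whole "environment" is just these files:
    print(sorted(os.listdir(venv_dir)))  # e.g. ['bin', 'include', 'lib', 'pyvenv.cfg']
    # And the activate script does nothing but set environment variables:
    with open(os.path.join(venv_dir, "bin", "activate")) as f:
        script = f.read()
    print("VIRTUAL_ENV" in script, "PATH" in script)  # True True
```

So any code you run inside the venv has exactly the same access to your files and your account as code run outside it.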

> I suspect that if someone managed to upload malware to PyPI as part
> of a well known project, it would get noticed pretty quickly. So if
> you're getting, say, flask or matplotlib from PyPI, you're probably
> pretty safe. On the other hand, if you're downloading and running
> "10Cent11" or "importantpackage" without doing any research on them
> ... well, not so safe.

I liked very much the JFrog blog comments you pointed to, which
highlighted the malign uses of typosquatting, e.g., the dodgy and
invasive package "distutil" getting confused with the well-known package
"distutils".  This is a recurring subvariety of social engineering,
e.g., http://linuxmafia.com/~rick/lexicon.html#frogery .
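To illustrate how little distance a typosquatter needs: "distutil" sits close enough to "distutils" that even a naive string-similarity check flags it.  This is a hypothetical sketch (the package list and cutoff are invented for the example; it is not anything PyPI actually runs):

```python
import difflib

# Invented list of "well-known" names, for illustration only.
POPULAR = {"distutils", "requests", "urllib3", "numpy", "matplotlib"}

def looks_like_typosquat(name, known=POPULAR, cutoff=0.85):
    """Flag a name suspiciously close to, but not identical with,
    a well-known package name."""
    if name in known:
        return False  # exact match: the real thing, not a squat
    return bool(difflib.get_close_matches(name, known, n=1, cutoff=cutoff))

print(looks_like_typosquat("distutil"))   # True  (one letter short)
print(looks_like_typosquat("requests"))   # False (exact match)
print(looks_like_typosquat("reqests"))    # True  (dropped letter)
```

Real defenses have to be fuzzier than this, of course, since plenty of legitimate packages have similar names; the point is only that the confusable namespace is enormous.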

> > effectively uncurated (or loosely curated) "bazaar" code hosting sites,
> > e.g., the older instantiation of addons.mozilla.org (before Mozilla,
> > Inc. cracked down on the dangerous chaos there), Gnome-look.org, and
> Did they? I thought only plugins marked "recommended" had been
> curated, and that anyone could still submit a plugin.

I haven't followed closely what a.m.o (addons.mozilla.org) has been
doing since the company's shocking decision that Firefox would no longer
run non-Mozilla-signed extensions.  That certainly fixed one problem for
the corporation, but in my view was open-source-hostile and, along with
other things, has largely impelled me to look elsewhere.  

This signed-by-us-only policy means that, yes, anyone may still submit
an extension, and _if_ Mozilla, Inc. is willing to sign that code, then
it will be made available at a.m.o unless/until Mozilla, Inc. removes it
-- but it also means that nobody may run an extension in recent versions
of Firefox that Mozilla, Inc. hasn't literally signed off on.
I considered that policy change shocking, and a unilateral abridgement
of user freedom that ought to make users of open source code wary, to
say the least.  If I am allowed to run only what corporate management
approves of, then it's no longer really open source.

> Right. Sometimes you want/need something that isn't in distro repos,
> but when doing so, always be conscious of the risks.

I tried to articulate the nuances of that problem years ago, when I was
one of _Linux Gazette's_ main editors, at
http://linuxmafia.com/~rick/weatherwax.html#1 :

  [1] Rick Moen comments: While it's useful and worthwhile to know about
  a program's "upstream" development site, where (among other things) the
  author's latest source code can be downloaded, there are a few
  disadvantages that should be noted (and some alternative locations that
  should usually be preferred, instead, if such are findable):

  1. Absent extraordinary measures on your part, your Linux distribution's
  package-tracking system won't know about the program's presence on your
  system. Therefore, it won't know to avoid installing conflicting
  programs, removing libraries it depends on, etc.

  2. You won't get any tweaks and enhancements that may be normal (or
  necessary!) for applications on your Linux distribution — unless you
  yourself implement them. You won't get security patches, either, except
  those written by the upstream author.

  3. Along those same lines, the desirable version to compile and run may
  well not be the author's latest release: Sometimes, authors are trying
  out new concepts, and improvements & old bugs fixed are outweighed by
  misfeatures & new bugs introduced.

  4. As a person downloading the upstream author's source code directly,
  you have to personally assume the burden of verifying that the tarball
  really is the author's work, and not that of (e.g.) a network intruder
  who cracked the download ftp site and substituted a trojaned version.
  Although this concern applies mostly to software designed to run with
  elevated privilege, it's not a strictly academic risk: Linux-relevant
  codebases that have been (briefly) trojaned in this fashion, in recent
  years, on the upstream author's download sites, include Wietse Venema's
  TCP Wrappers (tcpd/libwrap), the util-linux package, sendmail, OpenSSH,
  and the Linux kernel (CVS gateway's archive, only). Unless you are
  prepared to meaningfully verify the author's cryptographic signature —
  if any — on that tarball, you risk sabotaging your system's security.
  (None of those upstream trojanings escaped into Linux distributions
  because of distribution packager vigilance. Make sure you can be as good
  a gatekeeper, or rely on those who already do the job well.)

  All of the above are problems normally addressed (and the burden of
  solving them, shouldered) by Linux distributions' package maintainers,
  so that you won't have to. It's to your advantage to take advantage of
  that effort, if feasible. The memory of when a thousand Linux sysadmins,
  circa 1993, would need to do all of that work 999-times redundantly, is
  still fresh to us old-timers: We call those the Bad Old Days, given that
  today one expert package maintainer can instead do that task for a
  thousand sysadmins. And yes, sometimes there's nothing like such a
  package available, and you have no reasonable alternative but to grab
  upstream source tarballs — but the disadvantages justify some pains to
  search for suitable packages, instead.

  Depending on your distribution, you may find that there are update
  packages available directly from the distribution's package updating
  utilities, or from ancillary, semi-official package archives (e.g., the
  Fedora Extras and "dag" repositories for Fedora/RH and similar
  distributions), or, failing that, third-party packages maintained by
  reputable outside parties, e.g., some of the Debian-and-compatible
  repositories registered at the apt-get.org and backports.org sites.
  Although those are certainly not unfailingly better than tarballs, I
  would say they're generally so.

  The smaller, less popular, and less dependency-ridden a package is, the
  more you might be tempted to use an upstream source tarball. For
  example, I use locally compiled versions of the Leafnode pre-2.0 betas
  to run my server's local NNTP newsgroups, because release-version
  packages simply lack that functionality altogether. On the other hand,
  that package's one dependency, the PCRE (Perl-compatible regular
  expressions) library, I satisfy from my distribution's official
  packages, for all the reasons stated above.

  [RM's post-publication addendum: I should have also mentioned that
  package-oriented distributions generally also have simple toolsets to
  craft local packages from tarballs when that is the best option for
  whatever reason, rounding out my listing of ways to avoid the lingering
  menace of unpackaged system software on software architectures with
  good, useful package management. (Linux distributions that either
  deliberately or otherwise lack software & configuration management using
  package management tools are, obviously, a separate topic, not addressed
  here.)]

  [And, in case it wasn't obvious, binary tarballs from upstream
  distribution sites aren't significantly better than source ones. Either
  way, you get software your distro package regime knows nothing about,
  that won't get maintenance updates, that isn't built for your distro
  specifically, that poses code-authentication challenges you probably
  aren't equipped to shoulder, that doesn't benefit from
  distro-package-maintainer quality control, and that you might not even
  be able to figure out how to cleanly remove. All you gain with binary
  tarballs over source tarballs is avoiding the need to compile locally.]
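The verification burden in point 4 above has two halves: integrity (is this the file the author published?) and authorship (do I trust the key that signed it?).  The second half is `gpg --verify foo.tar.gz.asc foo.tar.gz` against a key you have independent reason to trust.  The first half can be as simple as comparing the tarball's SHA-256 digest against the one in the author's release announcement -- a minimal sketch (the function name is my own; the digest would come from the announcement):

```python
import hashlib

def sha256_matches(path, expected_hex):
    """Return True if the file at `path` hashes to the published
    SHA-256 digest.  Catches a swapped or corrupted download, but
    does NOT authenticate the author -- that still needs a GPG
    signature check against a trusted key."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large tarballs don't need to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest() == expected_hex.lower()
```

Note that a digest published on the same compromised download site is worthless; it has to come over a separate channel (a signed announcement, the project's mailing list) for the comparison to mean anything.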
