[conspire] People failing to learn about package gatekeeping, part 1

Sun Apr 17 12:21:32 PDT 2022

Deliberately circumventing your distro's package regime is reckless.
People keep learning that the hard way (or, worse, never figuring it
out).

Here is one of innumerable examples of the questionable "circumvent the
package regime" idea:  PyPI (Python Package Index).

CPAN, RubyGems, npm, Composer, PEAR, PyPI, NuGet, addons.mozilla.org,
gnome-look.org, Ubuntu PPAs, and upstream maintainer sites cited in
passing, ditto.  All deliberately circumvent the orderliness and
protection of your package regime, to your detriment, and should be
used, if at all, with reluctance, caution, _and expertise_.

Or, tl;dr:  if you're not absolutely certain you know what you're doing,
just don't.

----- Forwarded message from Rick Moen <rick at linuxmafia.com> -----

Date: Sun, 5 Dec 2021 16:01:34 -0800
From: Rick Moen <rick at linuxmafia.com>
To: sf-lug at linuxmafia.com
Subject: Re: [sf-lug] Malware on PyPI repository
Organization: If you lived here, you'd be $HOME already.

Quoting Akkana Peck (akkana at shallowsky.com):

[About https://arstechnica.com/information-technology/2021/11/malware-downloaded-from-pypi-41000-times-was-surprisingly-stealthy/ :]

> It always amazes me how bad articles about malware are. In this case,
> not bothering to mention the names of the packages except for two.

IMO, one significant if not dominant reason for this prevailing badness
is that most articles about malware are slightly adapted press releases
from antimalware companies.  The latter have zero incentive to help
end-users correctly understand security.  To the contrary, their best
interest is served by stirring up users about alleged and real threats
_without_ understanding them, therefore motivating them to buy
antimalware companies' goods and services.  And, with some honourable
exceptions, that's the way they express themselves, which is then
reflected very strongly in the IT press coverage that cribs from them
(which amounts to almost all IT press coverage).

> If you want to see the list of dangerous packages without sifting
> through all the comments to find it, it's at
> https://jfrog.com/blog/python-malware-imitates-signed-pypi-traffic-in-novel-exfiltration-technique/

Basically:  It details the stealthy measures that 11 Python code
offerings took to do mischief, if one made the error of installing them
on a "what could possibly go wrong?" basis from the third-party "PyPI"
code repository.

IMO, though, that isn't very interesting.  Anyone who installs and
executes untrustworthy code knows -- or learns really quickly -- that
the code can carry out any action its user authority permits.

In my view, it's more interesting to back away and consider larger
context:  What/who is PyPI?  What makes its operation trustworthy or
not?  Is (or was) there meaningful vetting of code contributors and of
what they submit?  If there wasn't (and the smart money is on "there
wasn't"), have they now learned anything?  And why would members of the
general public be sourcing their code from PyPI in the first place?

PyPI Project's self-description:

  The Python Package Index (PyPI) is a repository of software for the
  Python programming language.  PyPI helps you find and install software 
  developed and shared by the Python community.

Looking at the FAQ, I see that this is a means for people on any OS to
circumvent their distro protections (if any) to grab and install Python
interpreted code from a large number of Python coders and add it to a
real system using Python's "pip" installer tool.  Ergo, among other
things, your system package regime (deb, rpm, whatever) won't know
anything at all about what you grab from PyPI.
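A minimal sketch of how one might see this on a given machine, assuming
Python 3.8 or later:  it groups the Python distributions visible to the
interpreter by the contents of their optional INSTALLER metadata file.
pip records "pip" there; what a distro's own packages record (if
anything) varies, so treat the grouping as a hint, not proof.  The
function name installed_by is made up for illustration.

```python
from collections import defaultdict
from importlib import metadata


def installed_by():
    """Group installed Python distributions by the tool that installed them."""
    groups = defaultdict(list)
    for dist in metadata.distributions():
        name = dist.metadata.get("Name") or "(unnamed)"
        # INSTALLER is an optional one-line metadata file; pip writes "pip".
        # Assumption: a missing or empty file is reported as "unknown".
        installer = (dist.read_text("INSTALLER") or "").strip() or "unknown"
        groups[installer].append(name)
    return {tool: sorted(names) for tool, names in groups.items()}


if __name__ == "__main__":
    for tool, names in sorted(installed_by().items()):
        print(f"{tool}: {len(names)} distribution(s)")
```

Anything listed under "pip" got onto the system without dpkg or rpm
recording it, so the distro's tools will not update or remove it.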

Some means are furnished to limit the inherent harm doing this creates
to the target system, notably the option to install code from PyPI into 
one of Python3's "venv" lightweight virtual environments, isolating it
somewhat from the system.  You can also decline to run any PyPI code
that wants to talk you into giving it elevated privilege via sudo or
otherwise.

Getting back to my questions:  Who are these guys?  Well, it's an
offshoot of Python Software Foundation.  It's like what CPAN is for
Perl, Gems for Ruby, npm for JavaScript, Composer and PEAR for PHP,
NuGet for .NET.  Is (or was) there meaningful vetting of code
contributors and of what they submit?  Well, kind of no.  Judging by the
FAQ, you just contact them and say "Hi, I write cool stuff in Python and
wish to be a project owner on PyPI", and they make you one.  Dan Goodin
at Ars Technica says:

  Use of open source repositories to push malware dates back to at least
  2016, when a college student uploaded malicious packages to PyPI,
  RubyGems, and npm.  He gave the packages names that were similar to
  widely used packages already submitted by other users.

So, very weak, functionally nil, vetting of new code maintainers and
also of what they submit.  And this really should not be even a tiny bit
surprising.  We've seen this sort of thing over and over and over, on
effectively uncurated (or loosely curated) "bazaar" code hosting sites,
e.g., the older instantiation of addons.mozilla.org (before Mozilla,
Inc. cracked down on the dangerous chaos there), Gnome-look.org, and
dozens of such places, where the "Grab code[1] from here and trust it"
model was ripe for abuse and got it in spades.

Have they learned anything?  Well, not as one might wish.  Or rather, 
looking at it much less harshly, they're apparently fine with being what
they are, as to the deliberately limited scope of what they provide.
Which is to say, they're willing to accept and act on security reports
that PyPI codebase [X] hax0red your system, by investigating and
removing [X] and mildly swatting its submitter, but that's the limit of
it -- a matter discussed in greater detail here:
https://security.stackexchange.com/questions/79326/which-security-measures-does-pypi-and-similar-third-party-software-repositories

So, takeaway lesson:  If you disregard the gatekeeping protection of
your distro package regime, and go nonchalantly grabbing things from
the likes of CPAN, RubyGems, npm, Composer, PEAR, PyPI, NuGet, or
addons.mozilla.org (even now), gnome-look.org, Ubuntu PPAs, or upstream
maintainer sites, you are playing with fire and may get burned.  Unix
has provided the rope with which you can efficiently hang yourself, and
will not protect your neck from the harm you are imposing on it.

[1] In the case of Gnome-look.org, the site wasn't supposed to be
hosting code, only GNOME/GTK themes, artwork, icons, splash screens, and
screen savers, but the bad guy discovered the site didn't prevent him
uploading executable trojan code (set to go off by installing a .deb
package), and he relied on GNOME users downloading his "screen saver" to
be too clueless to notice they were being asked to do something
reckless.  (GNOME screen savers aren't provided in .deb packages.)
https://lwn.net/Articles/367874/
https://www.linux-magazine.com/Online/News/Malicious-Screensaver-Malware-on-Gnome-Look.org

(Heh, notice in the LWN story's reader comments, I gave my take on the
problem then -- in 2009.)

----- End forwarded message -----
----- Forwarded message from Rick Moen <rick at linuxmafia.com> -----

Date: Sun, 5 Dec 2021 18:59:46 -0800
From: Rick Moen <rick at linuxmafia.com>
To: sf-lug at linuxmafia.com
Subject: Re: [sf-lug] Malware on PyPI repository
Organization: If you lived here, you'd be $HOME already.

Quoting Akkana Peck (akkana at shallowsky.com):

> That's correct. It has about the same security as installing from a
> github repo (i.e. basically none), but it's a lot easier for the user.
> 
> Packages on PyPI are signed, but that by itself doesn't tell you
> anything since anyone can create a GPG key and sign a package.

Better than nothing -- in that it at least foils tampering by third
parties (such as someone who's compromised security on the hosting
site), and, if you get hosed, at least you can trace who (or at least
what GPG keyholder persona) hosed you.

So, it doesn't prevent getting hosed by Moriarty, the Napoleon of Crime,
but _does_ avert getting hax0red by Moriarty's brother Fredo.

> Unfortunately, a virtualenv doesn't protect your system at all.
> It's not a chroot or anything like that, just a set of environment
> variables defining things like PYTHONPATH.

Yes, I merely meant code running in the venv is isolated to that degree
from the system -- more a protection against mishap than against malice.
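To make the "mishap, not malice" distinction concrete, here is a small
sketch (not a security tool) that reports whether the interpreter is
running inside a venv and whether the current user's home directory is
still readable and writable from there.  The function name venv_report
is invented for this example.  The comparison of sys.prefix with
sys.base_prefix is the usual way to detect a venv on Python 3.

```python
import os
import sys


def venv_report():
    """Report whether a venv is active and what it does not restrict."""
    home = os.path.expanduser("~")
    return {
        # A venv changes sys.prefix; base_prefix still names the real install.
        "in_venv": sys.prefix != sys.base_prefix,
        # File access depends on the user account, not on the venv.
        "home_readable": os.access(home, os.R_OK),
        "home_writable": os.access(home, os.W_OK),
    }


if __name__ == "__main__":
    for key, value in venv_report().items():
        print(f"{key}: {value}")
```

On a typical desktop account, both "home" entries come back True whether
or not a venv is active, which is the point:  the venv redirects where
packages get installed, and nothing more.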

> I suspect that if someone managed to upload malware to PyPI as part
> of a well known project, it would get noticed pretty quickly. So if
> you're getting, say, flask or matplotlib from PyPI, you're probably
> pretty safe. On the other hand, if you're downloading and running
> "10Cent11" or "importantpackage" without doing any research on them
> ... well, not so safe.

I liked very much the JFrog blog comments you pointed to, pointing to
the malign uses of typosquatting, e.g., relying on dodgy and invasive
package "distutil" getting confused with the well-known package
"distutils".  This is a recurring subvariety of social engineering,
e.g., http://linuxmafia.com/~rick/lexicon.html#frogery .
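As an illustration of how a near-miss name can be caught before
installation, here is a rough sketch using difflib from Python's standard
library.  The short WELL_KNOWN list, the 0.85 cutoff, and the function
name lookalike_of are all assumptions chosen for the example, not
anything PyPI or pip provides.

```python
import difflib

# Hypothetical list of names you actually intend to use; a real one
# would be much longer and maintained locally.
WELL_KNOWN = ["distutils", "flask", "matplotlib", "requests"]


def lookalike_of(name, known=WELL_KNOWN, cutoff=0.85):
    """Return the well-known name that `name` resembles, or None."""
    candidate = name.lower()
    if candidate in known:
        return None  # exact match: the name itself is on the list
    close = difflib.get_close_matches(candidate, known, n=1, cutoff=cutoff)
    return close[0] if close else None


if __name__ == "__main__":
    for requested in ("distutil", "flask", "numpy"):
        print(requested, "->", lookalike_of(requested))
```

A non-None result means "this is almost, but not quite, a name you know",
which is exactly the situation a typosquatter is counting on.  A None
result says nothing about whether the package is safe.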

> > effectively uncurated (or loosely curated) "bazaar" code hosting sites,
> > e.g., the older instantiation of addons.mozilla.org (before Mozilla,
> > Inc. cracked down on the dangerous chaos there), Gnome-look.org, and
> 
> Did they? I thought only plugins marked "recommended" had been
> curated, and that anyone could still submit a plugin.

I haven't followed closely what a.m.o (addons.mozilla.org) has been
doing since the company's shocking decision that Firefox would no longer
run non-Mozilla-signed extensions.  That certainly fixed one problem for
the corporation, but in my view was open-source-hostile and, along with
other things, has largely impelled me to look elsewhere.  

This signed-by-us-only policy still means that, yes, anyone may still
submit an extension, and _if_ Mozilla, Inc. is willing to sign that
code, then it will be made available at addons.mozilla.org unless/until
Mozilla, Inc. removes it -- but also means that nobody may run an
extension in recent versions of Firefox that Mozilla, Inc. hasn't
literally signed off on.  I considered that policy change shocking, and
a unilateral abridgement of user freedom that ought to make users of
open source code wary, to say the least.  If I am allowed to run only
what corporate management approves of, then it's no longer really open
source.

> Right. Sometimes you want/need something that isn't in distro repos,
> but when doing so, always be conscious of the risks.

I tried to articulate the nuances on that problem years ago, when I was
one of _Linux Gazette's_ main editors, at
http://linuxmafia.com/~rick/weatherwax.html#1 :

  [1] Rick Moen comments: While it's useful and worthwhile to know about
  a program's "upstream" development site, where (among other things) the
  author's latest source code can be downloaded, there are a few
  disadvantages that should be noted (and some alternative locations that
  should usually be preferred, instead, if such are findable):

  1. Absent extraordinary measures on your part, your Linux distribution's
  package-tracking system won't know about the program's presence on your
  system. Therefore, it won't know to avoid installing conflicting
  programs, removing libraries it depends on, etc.

  2. You won't get any tweaks and enhancements that may be normal (or
  necessary!) for applications on your Linux distribution — unless you
  yourself implement them. You won't get security patches, either, except
  those written by the upstream author.

  3. Along those same lines, the desirable version to compile and run may
  well not be the author's latest release: Sometimes, authors are trying
  out new concepts, and improvements & old bugs fixed are outweighed by
  misfeatures & new bugs introduced.

  4. As a person downloading the upstream author's source code directly,
  you have to personally assume the burden of verifying that the tarball
  really is the author's work, and not that of (e.g.) a network intruder
  who cracked the download ftp site and substituted a trojaned version.
  Although this concern applies mostly to software designed to run with
  elevated privilege, it's not a strictly academic risk: Linux-relevant
  codebases that have been (briefly) trojaned in this fashion, in recent
  years, on the upstream author's download sites, include Wietse Venema's
  TCP Wrappers (tcpd/libwrap), the util-linux package, sendmail, OpenSSH,
  and the Linux kernel (CVS gateway's archive, only). Unless you are
  prepared to meaningfully verify the author's cryptographic signature —
  if any — on that tarball, you risk sabotaging your system's security.
  (None of those upstream trojanings escaped into Linux distributions
  because of distribution packager vigilance. Make sure you can be as good
  a gatekeeper, or rely on those who already do the job well.)

  All of the above are problems normally addressed (and the burden of
  solving them, shouldered) by Linux distributions' package maintainers,
  so that you won't have to. It's to your advantage to take advantage of
  that effort, if feasible. The memory of when a thousand Linux sysadmins,
  circa 1993, would need to do all of that work 999-times redundantly, is
  still fresh to us old-timers: We call those the Bad Old Days, given that
  today one expert package maintainer can instead do that task for a
  thousand sysadmins. And yes, sometimes there's nothing like such a
  package available, and you have no reasonable alternative but to grab
  upstream source tarballs — but the disadvantages justify some pains to
  search for suitable packages, instead.

  Depending on your distribution, you may find that there are update
  packages available directly from the distribution's package updating
  utilities, or from ancillary, semi-official package archives (e.g., the
  Fedora Extras and "dag" repositories for Fedora/RH and similar
  distributions), or, failing that, third-party packages maintained by
  reputable outside parties, e.g., some of the Debian-and-compatible
  repositories registered at the apt-get.org and backports.org sites.
  Although those are certainly not unfailingly better than tarballs, I
  would say they're generally so.

  The smaller, less popular, and less dependency-ridden a package is, the
  more you might be tempted to use an upstream source tarball. For
  example, I use locally compiled versions of the Leafnode pre-2.0 betas
  to run my server's local NNTP newsgroups, because release-version
  packages simply lack that functionality altogether. On the other hand,
  that package's one dependency, the Perl PCRE library, I satisfy from my
  distribution's official packages, for all the reasons stated above.

  [RM's post-publication addendum: I should have also mentioned that
  package-oriented distributions generally also have simple toolsets to
  craft local packages from tarballs when that is the best option for
  whatever reason, rounding out my listing of ways to avoid the lingering
  menace of unpackaged system software on software architectures with
  good, useful package management. (Linux distributions that either
  deliberately or otherwise lack software & configuration management using
  package management tools are, obviously, a separate topic, not addressed
  here.)]

  [And, in case it wasn't obvious, binary tarballs from upstream
  distribution sites aren't significantly better than source ones. Either
  way, you get software your distro package regime knows nothing about,
  that won't get maintenance updates, that isn't built for your distro
  specifically, that poses code-authentication challenges you probably
  aren't equipped to shoulder, that doesn't benefit from
  distro-package-maintainer quality control, and that you might not even
  be able to figure out how to cleanly remove. All you gain with binary
  tarballs over source tarballs is avoiding the need to compile locally.]
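The smallest piece of the code-authentication burden described above can
be sketched as follows:  comparing a downloaded tarball's SHA-256 digest
with one the author published.  This is only a sketch, and the function
names are invented here.  Note the limits:  a digest fetched from the
same site as the tarball proves nothing if that site was cracked, and a
matching digest says nothing about who wrote the code.  Checking the
author's cryptographic signature against a key you have independent
reason to trust is the stronger step, and is not shown.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """True if the file's digest equals the digest the author published."""
    expected = published_hex.strip().lower()
    return hmac.compare_digest(sha256_of(path), expected)
```

Distro package tools do the equivalent of this, plus signature checks,
on every install, which is part of what you give up when you go around
them.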

----- End forwarded message -----
----- Forwarded message from Rick Moen <rick at linuxmafia.com> -----

Date: Sun, 5 Dec 2021 18:30:54 -0800
From: Rick Moen <rick at linuxmafia.com>
To: sf-lug at linuxmafia.com
Subject: Re: [sf-lug] Malware on PyPI repository
Organization: If you lived here, you'd be $HOME already.

Quoting Bobbie Sellers (bliss-sf4ever at dslextreme.com):

> Well Rick I see the danger as the method used which might be extended to
> other platforms than PyPi. 

{sigh}

That'd take a lot of explaining.  I just got through detailing why the
core problem (or rather, limitation) is that PyPI does zero curating,
and so _if_ people are careless enough to trust (and download and
execute) any-old-thing from any-old-person just because they find it on
PyPI, they are likely to shoot at their feet.

> You are always saying that getting the malware downloaded to a machine
> is the weak point of all these threats and they appear to have gotten
> into the repository without problems.

First of all, that's _not_ the thing I'm always saying.

What you're probably thinking of is my frequent point that the only
interesting question about malware is how it gets executed (and, if
necessary, how it escalates privilege enough to do harm).  Not what you
understood me to have said, at all.

And, separately, I just got through explaining that the bad guys "got
into the repository" because PyPI / Python Software Foundation accepts
any Python project from anyone.  On present evidence, they do zero
vetting, only tossing projects / owners who've been found to do evil.

And, to repeat myself yet again, the easy/obvious way to avert these
risks is to lean on distro package maintainers as gatekeepers.  Only at
your peril, and with great caution, and if/when absolutely necessary, 
circumvent the distro package regime to source code from elsewhere --
and then be aware that _you_ must assume all the duties of being a
gatekeeper, and may shoot at your feet if you fail at those duties.

----- End forwarded message -----