[dvlug] Tracking users via hashed URLs

Michael Paoli Michael.Paoli at cal.berkeley.edu
Tue Nov 16 08:12:20 PST 2021


> From: "Rick Moen" <rick at linuxmafia.com>
> Subject: [dvlug] Tracking users via hashed URLs
> Date: Mon, 15 Nov 2021 21:30:03 -0800

> There's been a strange element in Grant's recent postings to the
> "list at dvlug.org" ostensible mailing list.  Did anyone else notice it?

Yep, noticed ... and it's annoying.

>   Some have within the power to recall.
>   https://web.archive.org/web/*/http://dvlug.org/
>
> Grant replied, quoting the above, except it came out like this:
>
>   > Some have within the power to recall.
>   >  
> http://email.dvlug.org/c/eJw1jssKgzAURL8m2VX0Jpq4yMJaq1Ao7UJxG2N84QsrSv--obVwmMUwcKYUjDJS4laEAYKzTLb7kieR96ZpGigJPGfjrYMwi56sg26fEbkY_C0evSR7xP5QI2oPsu2t-ptqGnAjHK0dZbNSVxWlhDiyAsqpW4BmIIFxPAjic5d4QHEvmnWdX4iYA1fDrgtLLqppN21NS_1rTOJFHPVhPP1XxvkBR4I9Dw*/http://dvlug.org/
>
> Hey?  What's up with the rewritten URL with the huge hash string in it?
>
> One of their pages about their key value proposition spells it out:
>
>   ANALYTICS
>
> Getting to the main point, it's about _tracking_.

There's another huge issue with that, besides the tracking.

All those rewritten URLs?  They only continue to work so long as
the tracking provider continues to support 'em.

So, let's see, e.g. ...:
... well, I'm just getting 404 - let me see if I can get back to a
less re-mangled copy of one of the tracking URLs ... dang, still 404 ...
how far back do I have to go? ...
okay, with a more freshly mangled one, at or closer to the original
tracker mangling:
$ curl -I \
'http://email.dvlug.org/c/eJwdjsFOwzAQRL8mPlplHdvNwYe0lAYhITi06g1t7G3i1HEiE1L17zFITzOaOT1ndKmFY97s6wJ22Kzv6dIc1KM8nWqLsL3o-DbA_nz41AMM97kQz5lqPUbVnD-O1dgV5WZEH3j3n3YaWW-uLSmEJwkAiFI6JZW-Sq02LThLpNloRLWVQkHJgumXZf4uRBZ4yVDkd3_zMzmPfEpdvv52rte4UIq0fNXJ9n4llszobY8U-IxT8NnEYuAtpRsFenByP7-l-Ea9'
HTTP/1.1 302 FOUND
Content-Length: 433
Content-Type: text/html; charset=utf-8
Date: Tue, 16 Nov 2021 15:56:06 GMT
Location: https://en.wikipedia.org/wiki/Internet_Archive
Server: nginx
X-Robots-Tag: noindex
Connection: keep-alive

$ dig +noall +answer +nottl email.dvlug.org. CNAME
email.dvlug.org.        IN      CNAME   mailgun.org.
$
So, such URLs will only continue to work so long as Mailgun continues
to hold and serve that data ... and continues to exist.  Mailgun goes
belly up and all that data goes bye-bye?  Then all those URLs become
essentially unusable garbage.  Some more nefarious operation buys
up the domain mailgun.org ... then just think where all those URLs
may go.  So, yeah, you're at the mercy of mailgun.org ...
in perpetuity.

So ... oh, and a free archiving service for it ... but only *after*
Mailgun does the mangling/tracking.  Uh huh.  So there is no
unadulterated archive for that list.  So ... y'all backing up that data,
right?  Including where all those URLs in the archive redirect to, so you
can later unmangle them?  Yeah, I didn't think so.  Huh.
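
For what it's worth, a minimal sketch (Python, standard library only) of
what such a backup might look like: pull the tracker URLs out of the
archived text, record where each one currently redirects, and keep that
mapping so the text can be unmangled later.  The URL pattern is an
assumption based only on the examples in this thread, and the function
names are mine, purely for illustration.

```python
import re
import urllib.error
import urllib.request

# Assumption: tracker links look like http://email.dvlug.org/c/<token>,
# with a base64url-style token, as in the examples in this thread.
TRACKER_RE = re.compile(r"https?://email\.dvlug\.org/c/[A-Za-z0-9_-]+")


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects so the Location header can be recorded."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError for the 3xx response


def find_tracker_urls(text):
    """Return the distinct tracker URLs in text, in order of appearance."""
    return list(dict.fromkeys(TRACKER_RE.findall(text)))


def resolve_redirect(url, timeout=10.0):
    """Send a HEAD request (like curl -I); return Location, or None."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.headers.get("Location")
    except urllib.error.HTTPError as err:
        # Unfollowed 3xx responses (and 404s) land here; a 404 has no Location.
        return err.headers.get("Location")


def build_mapping(text, resolver=resolve_redirect):
    """Map each tracker URL to its current redirect target, if it has one."""
    mapping = {}
    for url in find_tracker_urls(text):
        target = resolver(url)
        if target:
            mapping[url] = target
    return mapping


def unmangle(text, mapping):
    """Replace tracker URLs with saved targets; leave unknown ones alone."""
    return TRACKER_RE.sub(lambda m: mapping.get(m.group(0), m.group(0)), text)
```

Run build_mapping() over the archive while the tracker still answers,
save the result somewhere safe (e.g. as JSON), and unmangle() can be
applied whenever.  Links that already return 404 are simply left as they
are, since there is nothing left to recover.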

> Any mail that gets sent out as part of a Mailgun "campaign", the target
> market being bulk mailings for sales/marketing, that is issued by
> Mailgun's software (as Grant did) gets all URLs within the mailing's
> body text _altered_ to be individual to each user.  Thus the long hash
> value:  The hash is a marker that is unique to the subscriber.   If two
> list at dvlug.org "subscribers" compared copies each received from
> list at dvlug.org, their hashes will differ.
>
> The idea is that, any time a user visits a URL within a
> Mailgun-originated mail, the page is loaded via HTTP fetch from the
> Mailgun site with HTTP 302 redirect to the destination page.  The
> intermediate visit to the Mailgun site is registered in the Mailgun,
> Inc. logs as having been prompted by _this_ "campaign" e-mail, and
> specifically the copy sent to _this_ user.  If the user sends the
> "campaign" e-mail to a bunch of friends and they load the URL, then that

> You encounter these pestilential bits of e-mail surveillance practically
> everywhere, these days, and it's rare for people to get much worked up
> over them, because they're sort of a background annoyance of badness.

Well, not only tons of mail (dis)services that do such tracking and
the like, but tons of web sites (ah, but generally not on
[L]UG web sites).  E.g. most (anti-)"social" web sites or similar
components to many/most large commercial web sites - e.g. most
"news" sites ... got URLs on there?  Often they've got tracking URLs
to track referrals, etc.  And yes, of course that includes sites like
Farcebook, or Meta, or whatever the hell it's trying to rebrand itself
as these days 'cause it's trying to run from its horrible reputation.
(I think Icelandverse is much better:
https://youtu.be/enMwwQy_noI
)

> You know where you ordinarily never encounter them, though?  Linux user
> groups.

Uhm, ... or maybe outsource the backup of the tracker URL redirects
to another free service?
$ curl -s http://email.dvlug.org/robots.txt
User-agent: *
Disallow: /
$
Nope ... at least not to any legitimate archivers like archive.org.
So ... once Mailgun goes bye-bye, those redirects will mostly be forgotten.
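
To spell out what that robots.txt means for a well-behaved crawler, here
is a small sketch using Python's standard urllib.robotparser.  The
crawler name and the example token are placeholders of mine; with
"User-agent: *" any name gets the same answer.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content as returned by the curl command above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "ia_archiver" and the token below are illustrative placeholders.
print(allowed(ROBOTS_TXT, "ia_archiver",
              "http://email.dvlug.org/c/example-token"))
```

Any crawler that honors robots.txt is told to stay away from every path
on that host, tracker redirects included.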



