[conspire] Piping, redirection and shell scripts: 3/5/2025 7pm Eastern Standard Time

Mon Mar 3 14:56:12 PST 2025

Michael Paoli wrote on 2025-03-03 09:07:

>>> This sounds straightforward
>>> 
>> No, it does *not* sound straightforward!
>> 
> Uhm, sounds pretty straightforward(ish) to me.

That "ish" is doing some heavy lifting...

>> The list of all domain info should be in a DB or at least a text 
>> file, for starters.
>> 
> Ah, if only.  In many cases the data is in non-ideal forms/sources, 
> and in many cases we have little to no choice about it.

This appears to be a list of domains that have been registered and are
for sale (aka domain squatting?), so I'm not convinced that pulling from
multiple registrars isn't a choice.

Sure, pulling someone *else's* data from a web site is sometimes
necessary, but I don't think this is that.

>> From there, do the sort(s), build the YAML, etc. Why sort a YAML 
>> file?!? Running `diff` and manually merging?
>> 
> Typically there are better (semi-)automated ways, e.g.:
> $ comm -23 <(sort -u < file1) <(sort -u < file2)
> That, among multiple possible ways, will give exactly once, each
> unique line present in file1 that isn't present in file2.

That's a neat trick; I'll try to remember that!
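
A minimal sketch of that trick with made-up sample data. It sorts into
temporary files rather than using <(...), since process substitution
needs bash, ksh, or zsh and this version also runs under plain POSIX sh.
The file names and domain lists are assumptions for illustration only:

```shell
# Bash equivalent of the core step:
#   comm -23 <(sort -u < file1) <(sort -u < file2)

# Work in a scratch directory; every file name here is hypothetical.
tmp=$(mktemp -d)

# file1: a freshly gathered list; file2: what is already on record.
printf '%s\n' example.org example.com example.net example.com > "$tmp/file1"
printf '%s\n' example.net example.edu > "$tmp/file2"

# comm requires sorted input; sort -u also collapses duplicate lines.
sort -u "$tmp/file1" > "$tmp/file1.sorted"
sort -u "$tmp/file2" > "$tmp/file2.sorted"

# -2 suppresses lines found only in file2, -3 suppresses lines common
# to both, leaving the lines that appear only in file1.
only_in_file1=$(comm -23 "$tmp/file1.sorted" "$tmp/file2.sorted")
printf '%s\n' "$only_in_file1"

rm -rf "$tmp"
```

With this sample data it prints example.com and example.org, each once.
Swapping the two file arguments (or using -13) gives the lines that are
only in file2 instead.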

>> All this scraping seems pointless, why is the source of this data 
>> inside web pages?
>> 
> Often the format/source isn't a choice, but what one needs to deal
> with.

Again, pretty sure the list of "domains I have registered, their expiry
dates, and descriptions" could have a better authoritative source, 
especially since the selling prices info won't be stored at the registrar.

Maybe there's a good explanation, but I remain doubtful.

Also, a lot of registrars will have API access to perform such lookups. 
That may (or may not) be an option here.