[conspire] Piping, redirection and shell scripts: 3/5/2025 7pm Eastern Standard time
Ron
admin at bclug.ca
Mon Mar 3 14:56:12 PST 2025
Michael Paoli wrote on 2025-03-03 09:07:
>>> This sounds straightforward
>>>
>> No, it does *not* sound straightforward!
>>
> Uhm, sounds pretty straightforward(ish) to me.
That "ish" is doing some heavy lifting...
>> The list of all domain info should be in a DB or at least a text
>> file, for starters.
>>
> Ah, if only. In many cases the data is in non-ideal forms/sources,
> and in many cases we have little to no choice about it.
This appears to be a list of domains that have been registered and are
for sale (aka domain squatting?), so I'm not convinced that pulling from
multiple registrars isn't a choice.
Sure, pulling someone *else's* data from a web site is sometimes
necessary, but I don't think this is that.
>> From there, do the sort(s), build the YAML, etc. Why sort a YAML
>> file?!? Running `diff` and manually merging?
>>
> Typically there are better (semi-)automated ways, e.g.:
> $ comm -23 <(sort -u < file1) <(sort -u < file2)
> That, among multiple possible ways, will give exactly once, each
> unique line present in file1 that isn't present in file2.
That's a neat trick; I'll try to remember that!
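For reference, a minimal sketch of what that comm invocation does. The file
contents here are made up for illustration, and temp files stand in for the
<(...) process substitution so the sketch also runs in a plain POSIX sh:

```shell
# Hypothetical sample data; file1/file2 mirror the names in the quoted command.
tmpdir=$(mktemp -d)
printf '%s\n' example.org example.com example.net example.com > "$tmpdir/file1"
printf '%s\n' example.net example.io > "$tmpdir/file2"

# comm needs sorted input; sort -u also collapses duplicate lines.
sort -u "$tmpdir/file1" > "$tmpdir/file1.sorted"
sort -u "$tmpdir/file2" > "$tmpdir/file2.sorted"

# -2 drops lines found only in file2, -3 drops lines common to both,
# so what remains is each line that appears only in file1.
only_in_file1=$(comm -23 "$tmpdir/file1.sorted" "$tmpdir/file2.sorted")
printf '%s\n' "$only_in_file1"

rm -rf "$tmpdir"
```

With this sample data, example.net is dropped because it is in both files,
and the duplicated example.com is reported only once.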
>> All this scraping seems pointless, why is the source of this data
>> inside web pages?
>>
> Often the format/source isn't a choice, but what one needs to deal
> with.
Again, pretty sure the list of "domains I have registered, their expiry
dates, and descriptions" could have a better authoritative source,
especially since the selling prices info won't be stored at the registrar.
Maybe there's a good explanation, but I remain doubtful.
Also, a lot of registrars will have API access to perform such lookups.
That may (or may not) be an option here.
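To illustrate the "text file first, then sort, then build the YAML" order
argued for above, here is a rough sketch. The tab-separated layout (domain,
expiry date, description), the file name domains.tsv, and the YAML keys are
all assumptions for the example, not the actual data in question:

```shell
# Hypothetical authoritative source: one domain per line, tab-separated
# fields (domain, expiry date, description). This layout is an assumption.
tmpdir=$(mktemp -d)
printf 'example.org\t2026-01-15\tClub site\n' > "$tmpdir/domains.tsv"
printf 'example.com\t2025-11-02\tFor sale\n' >> "$tmpdir/domains.tsv"

# Sort by the first field (domain name), then emit one YAML list entry
# per input line. Descriptions are double-quoted; this sketch assumes
# they contain no double quotes or backslashes.
yaml=$(sort -t "$(printf '\t')" -k1,1 "$tmpdir/domains.tsv" |
  awk -F '\t' '{
    printf "- domain: %s\n  expiry: %s\n  description: \"%s\"\n", $1, $2, $3
  }')
printf '%s\n' "$yaml"

rm -rf "$tmpdir"
```

The point of this ordering is that the YAML is a generated artifact: the
sorting happens on the plain-text source, so the YAML itself never needs to
be sorted or merged by hand.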