On Sep 16, 2025, at 8:55 PM, Wayne S via cctalk
<cctalk(a)classiccmp.org> wrote:
They do not observe robots.txt.
Sent from my iPhone
> On Sep 16, 2025, at 17:53, Wayne S <wayne.sudol(a)hotmail.com> wrote:
>
> I did notice the scraping.
> I toyed with the idea of putting ludicrous text files up that a normal user would not
> see and see which bot got them.
>
> Sent from my iPhone
>
>> On Sep 16, 2025, at 17:02, Bill Degnan via cctalk <cctalk(a)classiccmp.org> wrote:
>>
>> For those of you who run vintage computing-related info sites, have you
>> noticed all of the LLM scraper activity? AI services are using the LLM
>> scrapers to populate their knowledge bases.
>>
>> At any given moment 5-10 of them are active on vintagecomputer.net. It’s
>> funny, when I ask an AI about something vintage computing-related,
>> something obscure, I can trick it into giving me an answer from my own site.
>>
>> I have actually had to modify the site code to manage the traffic, to
>> improve efficiency.
>>
>> But they’re not going after just my site, these scrapers are absorbing
>> copies of the entire WWW.
>>
>> I wonder how long the WWW will remain open, it would be a bummer if I found
>> copies of my site elsewhere.
>>
>> Bill
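
A minimal sketch of the decoy-file idea from the thread, assuming the web server writes an Apache/nginx-style "combined" access log. The decoy path, the function name decoy_hits, and the sample bot name are all hypothetical, made up for illustration; nobody in the thread described an actual setup. It just counts which User-Agent strings requested a file no human visitor would be linked to.

```python
import re
from collections import Counter

# Hypothetical decoy path: a file linked nowhere a human visitor would see.
DECOY_PATHS = {"/decoy/ludicrous-facts.txt"}

# Matches the "combined" access log format (an assumption about the server).
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def decoy_hits(lines, decoy_paths=DECOY_PATHS):
    """Count requests for decoy paths, grouped by User-Agent string."""
    hits = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None:
            continue  # skip lines that are not in the expected format
        path = m.group("path").split("?", 1)[0]  # ignore any query string
        if path in decoy_paths:
            hits[m.group("agent")] += 1
    return hits


if __name__ == "__main__":
    sample = [
        '203.0.113.7 - - [16/Sep/2025:17:53:00 -0700] '
        '"GET /decoy/ludicrous-facts.txt HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
        '198.51.100.2 - - [16/Sep/2025:17:54:00 -0700] '
        '"GET /index.html HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
    ]
    for agent, count in decoy_hits(sample).most_common():
        print(count, agent)
```

User-Agent strings can be faked, so a tally like this is only a hint about who is fetching the decoy; grouping by the ip field instead is an easy variation.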