A few weeks ago, I saw a flurry of conversation about how you can now disallow OpenAI from indexing your personal website using:

```
User-agent: GPTBot
Disallow: /
```
That felt a bit “ex post facto,” as they say. Or, as Jeremy put it, “Now that the horse has bolted—and ransacked the web—you can shut the barn door.”
But I never got around to it.
Tangentially, Manuel asked: what if you updated your robots.txt and blocked all bots? What would happen? Well, he did it, and after a week he followed up. His conclusion?
> the vast majority of automated tools out there just don’t give a fuck about what you put in your robots.txt
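That tracks, because robots.txt has always been purely advisory: a crawler has to choose to fetch it, parse it, and honor it. Python’s standard library even ships `urllib.robotparser` for exactly that. A minimal sketch of what a compliant client does with the block-all rules Manuel used (the URLs here are just placeholders):

```python
from urllib.robotparser import RobotFileParser

# The block-everything rules: every user agent, every path.
rules = [
    "User-agent: *",
    "Disallow: /",
]

rp = RobotFileParser()
rp.parse(rules)

# A compliant crawler asks before fetching -- and gets told no.
print(rp.can_fetch("GPTBot", "https://example.com/any-page"))   # False
print(rp.can_fetch("AnyOtherBot", "https://example.com/"))      # False
```

Nothing enforces that check, though. A scraper that never consults the file simply doesn’t see the rules, which is exactly Manuel’s point.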
That’s when I realized why I hadn’t yet added any rules to my robots.txt: I have zero faith in it.
Perhaps that faith is not totally based in reality, but this is what I imagine a robots.txt file doing for my website: