Protect your original content from being used for AI learning.
I recently read this tweet from Pieter Levels:
Added <meta name="robots" content="noai, noimageai"> to Photo AI so I don't dilute future AI data sets with AI generated images #responsibleGuy pic.twitter.com/3wGdllRzEU
— @levelsio (@levelsio) July 24, 2023
After digging a bit, I discovered that while it's not a standard practice, many sites have started implementing it. DeviantArt introduced it in 2022, and many others have followed.
To opt out your content, add the following meta tag to the head section of your HTML:
<meta name="robots" content="noai, noimageai" />
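To confirm the tag actually made it into a page, a small check like the one below can help. It is only a sketch using Python's standard library: the names `RobotsMetaParser` and `has_ai_opt_out` are made up for this post, and it only looks at `<meta name="robots">` tags, not at response headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta ... /> also end up here by default.
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        # "noai, noimageai" becomes {"noai", "noimageai"}
        self.directives.update(
            part.strip().lower() for part in content.split(",") if part.strip()
        )


def has_ai_opt_out(html):
    """Return True if the page declares both noai and noimageai."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return {"noai", "noimageai"} <= parser.directives
```

Calling `has_ai_opt_out(page_source)` on the HTML of one of your pages returns `True` only when both directives are present.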
You can also include the following headers in your server’s response:
X-Robots-Tag: noai
X-Robots-Tag: noimageai
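How you set these headers depends on your server. As a rough illustration, here is a minimal sketch with Python's built-in `http.server`; the handler name `OptOutHandler`, the helper `make_server`, and the demo page body are assumptions for this example, not part of any convention.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# The two directives from the headers shown above.
AI_OPT_OUT_DIRECTIVES = ("noai", "noimageai")


class OptOutHandler(BaseHTTPRequestHandler):
    """Serves a demo page and attaches the opt-out headers to the response."""

    def do_GET(self):
        body = b"<!doctype html><title>Demo</title><p>Hello</p>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        # One X-Robots-Tag header per directive, matching the example above.
        for directive in AI_OPT_OUT_DIRECTIVES:
            self.send_header("X-Robots-Tag", directive)
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep the demo quiet; remove this override to see request logs.
        pass


def make_server(port=8000):
    """Build a local server; call .serve_forever() on the result to run it."""
    return HTTPServer(("127.0.0.1", port), OptOutHandler)
```

Running `make_server().serve_forever()` and requesting `http://127.0.0.1:8000/` should show both `X-Robots-Tag` headers in the response. In production you would more likely set them in your web server or framework configuration.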
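If you want to target CCBot specifically, the Common Crawl crawler whose dataset has been used to train GPT models, add the following meta tag, which asks it not to follow the links on your page: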
<meta name="CCBot" content="nofollow" />
By implementing these measures, you can take control over how your original content is used in AI training.