

We’re already at that point. It’s already impossible to tell whether any given function or class was written by AI. You can only tell when the codebase is large, once the architecture has grown past what the AI could keep inside its context window.
Eventually AI will write better code than humans, and that’s when policies like this will seem very short-sighted.

We learned this lesson in the 90s: If you put something on the (public) Internet, assume it will be scraped (and copied and used in various ways without your consent). If you don’t want that, don’t put it on the Internet.
There are all sorts of clever things you can do to deter scraping, but none of them are 100% effective, and all of them have negative tradeoffs.
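As a minimal sketch of why these deterrents are advisory rather than enforceable: robots.txt is a request that cooperative crawlers honor and scrapers can simply ignore. The bot names and URL below are hypothetical, just for illustration.

```python
# robots.txt is a *request*, not an enforcement mechanism. A polite crawler
# checks it before fetching; a scraper can skip this step entirely, or just
# present a different user agent string.
from urllib.robotparser import RobotFileParser

robots_txt = """
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# A crawler that honestly identifies itself is told to stay out...
print(parser.can_fetch("GPTBot", "https://example.com/article"))       # False
# ...but the same request under a generic browser user agent is allowed,
# which is exactly why rules like this are not 100% effective.
print(parser.can_fetch("Mozilla/5.0", "https://example.com/article"))  # True
```

Nothing here blocks the request itself; compliance is entirely up to the client, which is the whole point.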
For reference, the big AI players aren’t scraping the Internet to train their LLMs anymore. That creates too many problems, not the least of which is making yourself vulnerable to data poisoning. If an AI is scraping your content at this point, it’s either amateurs, or they’re just indexing it the way Google would (or both), so the AI knows where to find it without having to rely on third parties like Google.
Remember: Scraping the Internet is everyone’s right. Trying to stop it is futile and only benefits the biggest of the big search engines/companies.