Gentoo, the venerable Linux distribution which in my headcanon I describe as ‘classy’, has banned any use of “AI”. A proposal to that effect, put forward by Gentoo Council member Michał Górny in February of this year, has been unanimously accepted by the Gentoo Council. The new policy reads:
It is expressly forbidden to contribute to Gentoo any content that has been created with the assistance of Natural Language Processing artificial intelligence tools. This motion can be revisited, should a case been made over such a tool that does not pose copyright, ethical and quality concerns.
↫ Michał Górny
We’ll have to see how this policy will be implemented, but I like that Gentoo is willing to take a stand.
I love Gentoo, and I used it for a very long time until I got old and switched to Arch.
Further, it is good to start a debate, and I share their concern about the large number of “false positives”: these tools produce a lot of simply wrong suggestions.
However, I do not like this ruling and find it very weak on the following points:
1) How is a crappy AI contribution worse than a crappy “copied from Stack Overflow” contribution, or a very well-crafted hostile code contribution? All contributions should be audited carefully, under the assumption that they may start WW3 or, worse, kill your kittens.
2) What are “ethical concerns”, and who is entitled to judge them? (Of course, the Gentoo makers can decide whatever they want at their own discretion; it’s their property and they owe nobody anything. But please don’t wrap it in ambiguous rules. Ambiguity is a tool of dictators and regimes.)
3) Don’t make rules you can’t enforce; you just ridicule yourself. How will they ever reject anything based on this rule? They can’t prove or sanction a violation.
The only merit I can see is plausible deniability regarding possible copyright infringement, if that ever becomes an issue.
Don Quixote much.
1) Crappy code copied from Stack Overflow is generally better than whatever an LLM-based AI derives from it (often, almost exactly what you might find on SO anyway; also, and less commonly understood, LLM AI tools EXCLUSIVELY produce derivative material, as they can’t write anything novel). The SO content is at least vetted (that’s what SO does), and there is often a pretty decent explanation alongside it, so you are more likely to derive a useful solution from an SO post. Honestly, the tools aren’t even comparable.
2) You are just going to have to go read more about ethics for that one. Ethical concerns are everywhere, and everyone has both the ability and the responsibility to judge them. Democracy is built around these ideas. Your question just seems anti-authority, not even anti-authoritarian, which comes off as pretty silly, especially applied to AI. There’s just not much more to say here.
3) Why does everyone assume this can’t be enforced, or that it is only about creating a rule specifically to be “enforced”? (Again, choosing that authority framing so you can argue against it; more silliness.) Sometimes rules are meant to make a statement of values, or to indemnify the group that wrote the rule against violators, intentional or unintentional, should they later come to light. Both of those have value even if you can’t enforce the rule. Though, yeah, it can be enforced: the idea that it can’t is as ridiculous as claiming any other kind of copyright can’t be enforced. It can.
> which comes off as pretty silly
You are the one trying to put the toothpaste back into the tube. I only chuckled at rules with a half-life shorter than the time it takes to read them.
> Why does everyone assume this can’t be enforced
Ever wondered why there seem to be so many wrong-way drivers coming toward you?!
Not only that, but they seem to be completely oblivious to what they are actually banning:
NLP includes everything from document generation to machine translation, grammar correction, predictive autocorrect, or even spam classification. They are basically banning all algorithms that work on natural-language text (the sketch after this comment makes that breadth concrete).
And…
Programming is generally not considered part of NLP, even though Large Language Models can be used to generate code. So technically, they leave out AI coding tools like GitHub Copilot or Amazon’s CodeWhisperer.
Basically luddites at work.
Completely unenforceable, and burying your head in the sand will not stop the AI “revolution” in programming, if it is indeed coming.
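To make the breadth of “Natural Language Processing tools” concrete, here is a minimal sketch of a toy bag-of-words spam classifier; the training data and function names are hypothetical, invented purely for illustration. Read literally, even something this mundane is an NLP tool, and a contribution produced with its assistance would seem to fall under the policy’s wording:

```python
from collections import Counter

# Hypothetical training data, purely for illustration.
TRAINING = [
    ("buy cheap pills now", True),       # spam
    ("limited offer click here", True),  # spam
    ("meeting moved to friday", False),  # ham
    ("please review my patch", False),   # ham
]

def train(samples):
    """Count word frequencies separately for spam and ham messages."""
    spam, ham = Counter(), Counter()
    for text, is_spam in samples:
        (spam if is_spam else ham).update(text.split())
    return spam, ham

def classify(text, spam, ham):
    """Label a message spam if its words were seen more often in spam."""
    words = text.split()
    return sum(spam[w] for w in words) > sum(ham[w] for w in words)

spam_counts, ham_counts = train(TRAINING)
print(classify("cheap pills offer", spam_counts, ham_counts))        # True
print(classify("please review the patch", spam_counts, ham_counts))  # False
```

A real filter would use something like naive Bayes with smoothing, but the point holds at any level of sophistication: it is all “algorithms that work on natural texts”.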
Hi, just to let you know that the URL and the RSS title say “bands use” instead of “bans use”.
Regarding banning AI, I don’t really think it’s possible. The most important thing is to distinguish between the completely different technologies that are boxed together as “AI”. Machine learning used for pattern recognition, such as deciphering lost languages or identifying types of cancer, is a really good use of this tech.
The main problem that must be regulated is the Massive Plagiarism Machine that is “generative AI”. This is what we need to tackle.
I don’t share the assumption that it can’t be enforced, but I agree there are positive uses for this type of tech (leaving aside the ethics of “training” methods). Here are a few in addition to what you referenced, and some that I think are less appropriate:
– Translation (sorry Thom!): in many cases, I think it can do an okay job. I actually don’t think this applies to artistic “localization”; if you want something better than average, you should hire a localizer.
– For TV shows and movies, and maybe anime voice-overs: a vocal synthesizer that can match vocal inflections and performance aspects from the original performance. This will suck for voice-over artists, but I think this could be a cool use for AI. There’s a recent human-based version of this in a show called Shogun, where many of the Japanese actors who speak English did their own voice-over work in English, and they were able to more closely match their original performance. It’s really quite lovely.
– Secondarily, some kind of deep-fake-ish adaptation to match the on-screen lips to the newly translated words. This actually solves multiple problems: it makes the lips match, but it can also remove a challenging constraint for voice localizations, namely squeezing an entirely different language into the original lip movements. A lot is lost in that effort.
To me, these could be solid uses for this type of technology. All of these are limited to very derivative art; that is, we aren’t asking this tech to create anything new. This is great, because this generation of generative AI (LLMs) really sucks at creating anything new, so this type of thing leans into its strength at creating derivative art. It’s really only ever able to produce something very, very similar to what is in its training set. It can rarely solve even simple novel problems if there isn’t an exact solution for that specific problem in its vector set. Most haven’t figured this particular limitation out; some are even trying to build businesses on this tech without fully understanding that limit. It’s pretty nuts out there.
Medicine is another great area to apply this tech, because you usually don’t want a novel solution to a medical problem. Usually, you want the tech to stick with what’s in the written materials. Doctors are trained to limit their diagnosis in this way.
Those three concerns are the emerging dive into the trough of disillusionment. Paraphrasing: “[NLP AI tools] pose copyright, ethical and quality concerns”. That’s exactly right! All three.