Ask HN: Any experience using LLMs to license-wash FOSS projects?
Can LLMs like Gemini, ChatGPT or Claude be used to generate an equivalent of a FOSS project but stripped of its licence and authorship?
Would this be legal?
For example, if a SaaS corporation wanted to modify and sell a service built on some AGPL project, could it use AI to rewrite that project entirely, effectively detaching it from its original creator and ownership?
It depends what you mean.
If you ask it "give me a library same as X", naming the library, and it does it, then the result will surely be based on the library's code and may even contain actual code from it. Don't do this.
If you feed it the code of the library piece by piece and ask for something "equivalent", that's even worse: it's explicitly derivative.
If you write your own documentation of how the library works, don't mention it by name, and it's not a very special-purpose library, and the LLM writes to your new spec... then you'll probably spend a lot to get a worse version. IANAL.
It would probably be simpler to pay for an alternative licence. Most companies that write code under FOSS licences offer paid dual-licensing schemes.
Removing the licence and/or authors from a FOSS project would generally be a violation of the licensing terms. The tool(s) you use don't change the legal principles.
Of course, the big AI companies blithely ignore moral and legal issues.
> Can LLMs like Gemini, ChatGPT or Claude be used to generate an equivalent of a FOSS project but stripped of its licence and authorship?
No.
The whole project (and, some may argue, the LLM trained on the AGPL code that is also running on the backend) would have to be open sourced as well.
Using LLMs to strip the licence and generate a derived project from the original AGPL code is not a 'clean room implementation'; it is the equivalent of rewriting the original author's existing code.
In a sane world I would agree, but in the US at least I am not certain this still holds: in Bartz v. Anthropic, Judge Alsup expressed the view that the work of an LLM is equivalent to that of a person; see around page 12, where he argues that a human recalling things from memory and AI inference are effectively the same from a legal perspective.
https://fingfx.thomsonreuters.com/gfx/legaldocs/jnvwbgqlzpw/...
To me this makes the clean-room distinction very hard to assert. What am I missing?
If a human reads the code and then writes an implementation, that is not clean room, and an LLM would in most cases be equivalent to that.
A clean room requires that the person writing the implementation have no special knowledge of the original implementation.
Could you share a source for this definition? As far as I know, it only means not having access to the code during the implementation of the new project.