• wonderingwanderer@sopuli.xyz
    5 hours ago

    That makes sense. I see the problem with that, and I don’t have a good solution for it. It is a digression, though, since we were discussing open-source programmers using LLMs that are potentially trained on closed-source code.

    Training LLMs on open-source code is worth its own discussion, but I don’t see how it fits in this thread. The post isn’t about closed-source programmers using LLMs.

    Besides, closed-source developers could’ve been stealing open-source code all along. They don’t really need AI to do that.

    Still, training LLMs on open-source code is a questionable practice for that reason, particularly when it comes to training commercial models on GPL code. But it’s probably hard to prove what code was used in their datasets, since the datasets themselves are closed.