

Why GitHub #Copilot doesn’t violate free software licences

🔗 https://forgoodeyesonly.codeberg.page/blog/2022/07/why-github-copilot-doesnt-violate-free-licences

#GitHub’s new code completion is an #AI trained using #FreeSoftware. Many see this as a copyright infringement of #copyleft licences, but this is dangerous half-knowledge. Read here why this is the case and why stricter #copyright law won’t get us anywhere.

#foss #license #licence #microsoft
Preview image for the blog article. Background: a code sample generated by GitHub Copilot, taken from its official website.
This entry was edited (11 months ago)
TL;DR:
  • Scraping code simply isn't a #copyright infringement.
  • #Copilot outputs are not derivative works.
  • As an artificial machine, Copilot is not an author in the meaning of copyright.
  • #GitHub doesn’t even claim copyright in the outputs.
  • The outputs don’t reach the necessary level of creation to be copyright-protected.
  • The AI's complexity is irrelevant for the protection of the outputs.
  • GitHub's terms of use override the repo licences.
While #GitHub #Copilot doesn't violate free #software licences (see ⬆️), there are plenty of reasons to #GiveUpGithub anyway. Here are a few:

1. Since we're #FOSS developers, our tools should be #FreeSoftware too.
2. Monopolies are never a good idea.
3. By using a walled garden, we're excluding potential contributors.
4. By using #Microsoft products, we're supporting a producer of #tracking malware and an #NSA collaborator.

Instead, we should switch to #Git platforms running @gitea, such as @codeberg. Also, @dachary and others are already working hard on #forge federation in the @forgefriends project.

Read more on the #ForGoodEyesOnly blog: https://forgoodeyesonly.codeberg.page/blog/2022/07/why-github-copilot-doesnt-violate-free-licences/#why-we-should-still-giveupgithub
I'm pretty sure it violates a lot if not all foss licenses because it strips the license file from the code it distributes, modified or not.

Whether or not Github's ToS override license files the author added to the code is an interesting point. If they indeed do then the code authors are at fault for not catching that when creating their projects there and trusting their license agreement would be honored by both users as well as Github themselves.
The (re-)distributed excerpts are too small to be copyright-protected, so there's no need for keeping the licence files.

Yes, developers should be very careful about creating forks of copyleft-licensed projects on third-party platforms. In contrast, permissive licences don't introduce such problems in the first place.
is that conjecture or a verifiable fact?

Unless there's a court decision (which I doubt there is) I'd say for example GPLv3 5a) and 5b) apply to anything non-trivial derived from Foss code taken from any repo.

The fact that they don't even attempt to keep track of which repo it was taken from and what license applied at the time leaves an additional sour taste.

I'd like to see someone change the license post-mortem and sue them for damages.
Not sure what you mean by “verifiable fact”. Please note that court rulings, especially those from lower instances, don't necessarily create certainty, since (a) judges are independent, (b) cases are different and (c) they only apply to one specific jurisdiction.

If the “derived” work doesn't contain an actual excerpt from the original code, in Germany this usage falls under § 44b UrhG which allows “text and data mining”; meaning that creating (non-trivial) works based on the *analysis* of copyright-protected code is not a copyright infringement, so there's no need to comply with the GPL in this case.

If the derived work does, in fact, contain actual excerpts from the original code, then it depends on whether those excerpts themselves reach a level of artistic creation that is high enough to fulfill the requirements of § 69a UrhG for copyright protection as a computer program: https://forgoodeyesonly.codeberg.page/blog/2022/07/why-github-copilot-doesnt-violate-free-licences/#copilot-outputs-dont-reach-the-necessary-level-of-creation
yeah the latter case is the interesting one. If you can't trace a code snippet to something on github it doesn't really matter.

What I'm wondering (and IANAL, which is why I'm asking about precedent) is whether this actually even falls under copyright law, since we're not talking about copyright protection but what I would call legitimate use within the license established by the code author.

But I really don't know. All I know is don't trust microsoft, ever.
I really hope forge federation will soon be ready so that large FOSS projects won't be able to argue anymore that they “need GitHub because all the contributors are there”.
even if it is, my guess is github is staying out of it because of their business model.
@Pixelcode Apps :verified: Here's the problem with number 7:

In order to supersede the GPL, you need the consent of every single contributor to the project.
Yes, that's why I argue it's a copyright infringement to upload a fork of a copyleft-licensed project to a third-party service: https://forgoodeyesonly.codeberg.page/blog/2022/07/why-github-copilot-doesnt-violate-free-licences/#copyleft-promotes-monopoly
