[RFC 0185] Redistribute redistributable software #185
I think a drawback for the plan as-is would be that some of the stuff is actually pretty large, so space usage at Hydra/cache goes up. A missing related piece of work here is to look at the reverse dependencies of unfree packages and to check whether they are a good idea to build. For normal packages we have some «it's large but downloading is longer than the build» packages marked as…
Good points! I just added them to the RFC. Also, I'll incidentally say that TPTP seems to already be marked as…
I think that, given the skew of unfree stuff towards large binaries, significantly increasing the evaluation time and somewhat increasing the storage growth are separate points to mention. (We probably need to ask the infrastructure team for feedback on all that at some point.)
@7c6f434c Why would binary-based packages significantly increase the evaluation time? Nixpkgs requires packages to pass strict evaluation, which means that downloading would never occur during evaluation. I haven't experimented with it, but I guess that packages that require a long evaluation time typically fall into the categories below:

none of the above are specific to unfree or binary-based packages. Update: some binary-based packages might be built with legacy versions of libraries, which would require custom overriding if such a version is uncommon in Nixpkgs. Still, such situations also occur with large packages like TensorFlow, and small projects with few dependencies wouldn't take too long to evaluate.
Evaluation time will increase because life is hard. Basically, even though…
I think I already listed the two points you're mentioning? The evaluation time issue was listed in the settings from the start; and I just added the built-size issue as an unresolved question, as it's currently unclear whether it's negligible or not.
Also, you seem to be hinting at a doubling of the eval time, but I don't think that'd be the case. Hydra would eval a pkgset that'd consist of essentially…
We could do Hydra's eval simply as now but with unfree allowed – and do this check separately in a channel-blocking job (executed on a builder instead of centralized with the eval). We have similar checks already; the CI might check this as well, but such regressions seem quite unlikely to me.
Oh right, evaluating all the ISOs is not negligible, but it can indeed be pushed to a build.
Thank you so much for working on this. Since MongoDB is the biggest offender, causing many people serious day-to-day trouble to build, perhaps we could also consider a phased rollout plan where it is the first thing to be included 😄
And MongoDB indeed has a license which avoids most general concerns, in the sense that the source is available, and both arbitrary patches (as they are derivative, they are presumed same-license-as-MongoDB in Nixpkgs anyway) and running inside a network-isolated sandbox are permitted without restriction. This is not true for all unfree-redistributable things…
The way I understand (and mean) the current RFC text, all currently unfree redistributable packages would stay out of Hydra until marked…

Are there any remaining concerns on the current RFC that I could address? :)
@NixOS/infra-build just so that all of you see it…

No objections to this RFC
According to [this discussion](https://github.com/NixOS/nixpkgs/issues/83433), the current status quo dates back to the 20.03 release meeting.
More than four years have passed, and it is likely worth rekindling this discussion, especially now that we actually have a Steering Committee.

Recent exchanges have been happening in [this issue](https://github.com/NixOS/nixpkgs/issues/83884).
For context, we also started building all the redistributable+unfree packages in the nix-community sister project.
See all the unfree-redis* jobsets here: https://hydra.nix-community.org/project/nixpkgs
It's only ~400 packages. The builds are available at https://nix-community.cachix.org/
The jobset is defined in nixpkgs to make upstreaming easier:
https://github.com/NixOS/nixpkgs/blob/master/pkgs/top-level/release-unfree-redistributable.nix
If this RFC passes it will be even better, as users don't necessarily know about, or want to trust, a secondary cache.
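For readers unfamiliar with how such a jobset is assembled, here is a minimal sketch of the filtering idea. It is an illustration under assumptions, not the contents of the actual `release-unfree-redistributable.nix`; it relies only on the `free` and `redistributable` attributes that nixpkgs licenses already carry:

```nix
# Sketch only: not the actual release-unfree-redistributable.nix.
# Select top-level derivations whose licenses are all redistributable
# but not all free, i.e. the "unfree but redistributable" set.
let
  pkgs = import <nixpkgs> { config.allowUnfree = true; };
  inherit (pkgs) lib;

  licensesOf = drv:
    let l = drv.meta.license or [ ];
    in if builtins.isList l then l else [ l ];

  isUnfreeRedistributable = drv:
    let ls = licensesOf drv;
    in ls != [ ]
       && lib.all (l: l.redistributable or false) ls
       && lib.any (l: !(l.free or true)) ls;
in
lib.filterAttrs
  (_name: v:
    # tryEval guards against attributes that refuse to evaluate at all
    (builtins.tryEval (lib.isDerivation v && isUnfreeRedistributable v)).value)
  pkgs
```

The real jobset additionally has to deal with recursing into package sets, platform filtering, and Hydra metadata, which this sketch deliberately omits.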
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's great to know, thank you! Though we may need to do a bit more to properly handle the "cannot be run on hydra" point that was raised above.
I can already see on the hydra link you sent that eval takes <1min, so should be a negligible addition to hydra's current eval times. Build times seem to take ~half a day. AFAIU there's a single machine running the jobs. If I read correctly, hydra currently has ~5 builders, and one trunk-combined build takes ~1 day. So it means that the build times would increase by at most ~10%, and probably less considering that there is probably duplication between what the nix-community hydra builds and what nixos' hydra is already building. I'm also not taking into account machine performance, which is probably stronger on nixos' hydra than nix-community's hydra.
I think this means eval/build times are things we can reasonably live with, and if we get any surprise we can always roll back.
There's just one thing I can't find in the links you sent to properly adjust the unresolved questions: do you know how large one build closure is on nix-community's hydra? I don't know how to get it on nixos' hydra either but it'd still help confirm there's zero risk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
> I think this means eval/build times are things we can reasonably live with, and if we get any surprise we can always roll back.
Yes, especially since the way the unfree-redis jobset is put together is by evaluating and filtering through all the nixpkgs derivations. So most likely the combined eval time is much smaller than the sum of the two.
> There's just one thing I can't find in the links you sent to properly adjust the unresolved questions: do you know how large one build closure is on nix-community's hydra?
The best I can think of is to build a script that takes all the successful store paths, pulls them from the cache, runs `nix path-info -s` on them, and then sums up the values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your answer! I actually more or less found the answer from Hydra's UI. Here is my script (fish syntax):

```fish
curl https://hydra.nix-community.org/jobset/nixpkgs/cuda/channel/latest > hydra-jobs
cat hydra-jobs | grep '<td><a href="https://hydra.nix-community.org/build/' | cut -d '"' -f 2 > job-urls
for u in $(cat job-urls); curl "$u" 2>/dev/null | grep -A 1 'Output size' | tail -n 1 | cut -d '>' -f 2 >> job-sizes; wc -l < job-sizes | head -c -1; echo -n " / "; wc -l < job-urls; end
awk '{sum += $1} END {print sum}' job-sizes
# NVidia kernel packages take ~1.3GiB each and there are 334-164 = 170
# Total: 215G, so 45G without NVidia kernel packages
```
I got the following results:
- For `unfree-redist-full`, a total of 215G, including 200G for NVidia kernel packages and 15G for the rest of the software
- For `cuda`, a total of 482G
Unfortunately I cannot run the same test on NixOS' hydra, considering that it disabled the channels API.
I just updated the RFC with these numbers; it might make sense to not build all of CUDA on Hydra at first, considering the literally hundreds of duplicated above-1G derivations :)
So with the current Hydra workflows I'd estimate that very roughly as uploading 2 TB per month to S3 (we rebuild stuff). Except that we upload compressed NARs, so it would be less.
Do I understand correctly that it'd be reasonable to do the following?

1. Just push everything, and
2. if compression is not good enough, roll back CUDA & NVidia kernels; and
3. even if we need to roll back, the added <1T would not be an issue to keep "forever"
I don't know. To me it doesn't even feel like a technical question. (3. is WIP so far, I think. There's no removal from cache.nixos.org yet.)
IANAL. This is not legal advice. My understanding is that the sections quoted by @emilazy such as "you may not distribute or sublicense the SDK as a stand-alone product" refer specifically to the License Agreement for "derivative works" and for the case of redistributing CUDA "as incorporated in object code format into a software application" (in other words - for applications / libraries that link against CUDA). Later on the same page, there is a "CUDA Toolkit Supplement to Software License Agreement" section (all emphasis mine):
and then, a bit later
This seems to directly conflict with the earlier "you may not distribute or sublicense the SDK as a stand-alone product" clause. So my best guess is that Section 1 of the EULA is intended to describe the licensing for "end users" of the SDK, and Section 2 is intended for "system integrators" (Linux distributions, package indices, etc). It's possible that…

Either way, I think that NVIDIA's intention behind section 2.3 was to allow exactly what we want to do. So even if the current license doesn't allow…
Also, even if distributing…
I think our current binary cache needs to have complete closures, i.e. you can leave out some (transitive) runtime dependencies. Oh, and I think the build farm isn't yet able to build stuff without pushing it into the cache.
Thanks! I apologize for missing that. (A good example of how difficult it can be to interpret licence text…) However, I wonder if it is really sufficient, even if we ignore the problem of modification (which seems risky to me, potentially riskier than your average basic…

Perhaps, as you said, 2.3 is what grants the permission for tools like…

I find “ELF is not an object code format” a legal theory that is incredibly unlikely to hold up, but who knows, I’m not a lawyer :) I agree that we could reach out to NVIDIA for a special exemption. I also agree that other projects have been relatively cowboy about this and have not yet been bitten by it. Still, if we’re going to deliberately violate EULAs because we think we can get away with it, I’d rather we do it with eyes open. (And I would personally be rather unhappy about it.)
Even ignoring Hydra/cache limitations, I don’t think this is true. The EULA explicitly grants only limited rights to modify the SDK, regardless of its redistribution requirements (“Except as expressly provided in this Agreement, you may not copy, sell, rent, sublicense, transfer, distribute, modify, or create derivative works of any portion of the SDK”). To build packages using CUDA, Hydra would have to modify the CUDA SDK in a manner that (by the premise of the suggestion) is prohibited. Even apart from that, we’d also either have to build CUDA separately on every builder that builds a CUDA‐using package, or establish that transferring that modified version around the project infrastructure doesn’t count as “redistribution” (this would be a pretty controversial theory, I think) and set up an entirely parallel secret CUDA cache.
*cannot, right?
I think I wasn't quite clear in my previous comment. I am not suggesting that we should just YOLO the NVIDIA license and hope that we don't get sued (although some other projects apparently have done so). My argument wasn't based on a legal interpretation of the text of the license, but on the (inferred) intentionality behind this text. It seems to me that the second part of this EULA was intended to allow downstream Linux distributions / package indices / etc to redistribute CUDA. So even if the Foundation lawyers decide that the current wording of the EULA doesn't allow for redistribution, it should probably be possible to get NVIDIA to adjust the wording.
Going to assume you meant "can't leave out". Aren't binary caches just glorified file shares? I didn't test this, but I would imagine that…
Yes, unfortunately that would probably require implementing extra functionality in Hydra.
Hmmm... I was under the impression that you can do whatever you want with CUDA code/binaries as long as you don't distribute the results. But now that you mention it, you're probably right. But if modifying CUDA in such ways is illegal (even without redistribution), wouldn't that mean that Nixpkgs/NixOS users that set…
It may be fair use, and perhaps some jurisdictions even allow it in general? I think there are two complications in this case: one, that “copying” tends to be interpreted quite liberally in the context of computers; but two, more importantly, that in this case the NixOS Foundation would be explicitly agreeing to an EULA contract that forbids it from doing so.
Yeah, possibly. And I think that it’s not totally without legal risk to distribute scripts that let people do that. But it’s at least much less risky. The Foundation itself never agreed to the EULA and isn’t distributing anything directly covered by it.
Just to be sure I didn't miss anything: outside of the discussion of whether specific licenses can actually be redistributable or not (which we should have again if the RFC lands, in one separate thread per such license/package), is there any change I should make to this RFC?

The only thing I did find against the RFC as-is is "it's too much effort if no package ends up benefitting from it"; but it's actually just adding one flag, so we've probably already spent more effort here than implementing the RFC will ever take. This being said, I did skip over the license-specific discussion (because I'm not focusing on any specific project yet and we will have to discuss this again anyway), so I may have missed another point within these parts.
At a minimum, I think the RFC should not list examples of packages that would be offered if our ability to safely legally build and redistribute them is questionable. I would remove at least CUDA and TeamSpeak, and likely MongoDB and UnRAR as well.

I do not think that implementation is just a matter of adding one flag, because I think the compliance burden to mitigate increased legal risk will be meaningful and require resources from the Foundation. I think that the RFC listing almost no drawbacks and describing the change as “basically risk-free” is a misleading picture in light of the concerns I’ve raised and the fact that discussion of the legality of distributing even some of the listed examples has ended up with “maybe the Foundation could negotiate with NVIDIA for licence changes”. That compliance and review burden trades off against the software that would become more conveniently available to users; if the most desired non‐free software still wouldn’t be viable for us to distribute, then the cost‐benefit becomes a lot worse. So I do think that this is relevant to the question of whether the RFC should be accepted.

I do strongly think a per‐package rather than per‐licence flag would be better, given how common complicated use restrictions are for non‐free software; unlike with FOSS, two non‐free packages under the same licence are much less likely to be able to be treated interchangeably. Especially for licences like the Business Source License where the compliance picture looks significantly different based on the Additional Use Grant – though arguably we should be encoding all of those as separate variants to begin with.

Beyond that, I do personally disagree with the aim of the RFC, as described in the second paragraph of #185 (comment). But I accept that it’s unlikely that we will be able to come to a consensus on the ideological matters here in the RFC discussion format, so I have focused on the problems and limitations of implementing the core idea. However, I am still concerned about the second‐to‐last bullet point of #185 (comment). I believe that non‐free software is likely to unnecessarily leak into the closure of free packages even if the installers are treated specially.
The RFC explicitly mentions multiple times that the examples are illustrative only. Literally all closed-source packages are going to be questionable until we spend the effort to validate what we can and should do, and there's no value in having this discussion until this RFC lands. So I disagree about this: otherwise there could be no examples at all, and I think the RFC is already explicit enough that the examples are in no way guaranteed to ever be approved.
The risk will come when we change packages to actually start using this flag, hence my describing this RFC as risk-free. This being said, I'll add a paragraph in the drawbacks, hopefully this evening, to say that each new approved use of this flag will carry risk and should be reviewed properly. So the drawbacks will mention an increase in legal review requests to the Foundation.
This is exactly how I see us moving forward when we start to think of using the new flags in licenses: either the unfree license is definitely fine and we can whitelist the license, or we'll need to define a new license type for each package we want to start building on Hydra. This also semantically makes more sense to me than just dumping everything in the "unfree" dumpster, as there are even more differences between proprietary licenses than between FOSS licenses.
That's a good point, thank you! I'm trying to think, and it'd likely make sense to me to run an eval job that verifies our licenses are semantically valid: e.g. no MIT package depends on a GPL package without the codegen exception, etc. This being said, such an effort is likely beyond the scope of this RFC. When I get to it this evening, I'll write that down as a drawback, and add the eval job as future work!
I am not convinced by this argument. We also support macOS, which requires a lot more proprietary components (the macOS SDK) and is harder to test. By the way, it also doesn't conveniently ship a meta.license attribute in nixpkgs, because it would be "unfree". This just enables a few more packages that can be tested on otherwise free operating systems.
The "Modification" needs to be interpreted in the legal sense. This means, are we creating a derivative work of CUDA. Patchelf is not creating a derived work, it's an automated patching process that is not even specific to CUDA of parts of the program that are not even the copywrite-protectable part of CUDA. Elf header existed way before CUDA and CUDA did not invented anything new here. If you would interpret this in the technical sense, you wouldn't be allowed to load the binary into memory because it is creating a modified copy of the original ELF file (i.e. patching up references in memory).
The license you linked says you cannot use the UnRAR code to re-create the RAR compression algorithm. This is different from what you said.
We publish the NixOS tests over the internet; this is enough for a user to reproduce the source code. And no, a NixOS test is not a service for third parties: it runs in a nix build and cannot be accessed by other parties this way.
NixOS is not doing this, so why would this be a problem?
See the point about CUDA above.
Since we are often re-packaging proprietary packages, this is less of a problem compared to open-source licensed projects, since the license would then already be part of the original distribution. But even if we get this part wrong, it's unlikely to have severe consequences. Usually you will first be contacted by the other party to fix the packaging before legal action is invoked.
We have the unfree check that will still fail to evaluate. This wouldn't be changed by this RFC.
Well, I do recommend you read the full discussion as the RFC author, since I tried to outline all the concerns I have, and even the discussions about specific software are illustrative of the general problems. I realize that the listed packages are just examples, but I also think that a large part of the desire for this is driven by a relatively small number of non‐free packages that are expensive to build – after all, the drawback of the status quo is described as “very long builds for lots of software”, though in practice I almost exclusively hear people talk about CUDA and MongoDB, and the motivation section directly talks about MongoDB – and those packages happen to be ones that are questionably legal for us to distribute. And clearly the RFC is meant as a referendum on the idea of trying to actually distribute meaningful amounts of non‐free software in practice, so I think that “it’s just adding a flag, which is zero‐risk, because all the problems only come the first time someone tries to actually use the flag” is misleading.

MIT software depending on GPL is not a licence problem and is pretty common. Free software depending on proprietary software isn’t uncommon, either; it’s only a problem when it unnecessarily does so. The RFC certainly makes that more likely to happen.
I read the full discussion and the RFC before I applied for shepherding. Thanks
If this were true, which it isn't (the license mainly restricts SaaS as opposed to distribution), MongoDB would be marked as non-redistributable.
Not convinced this is more likely; companies often have better license checks in place to stop this from happening, and open-source projects often don't care much about this. And the result, if a proprietary project depended on GPL code, would just be that the project is also under the GPL... This is not really our problem but the problem of the project.
What we use from the macOS SDK is headers, and I agree that before Apple adopted the…
Given that this clause says “provided that the object code files are not modified in any way (except for unzipping of compressed files)”, implying that unzipping compressed files counts as a modification, do you think NVIDIA’s lawyers would agree with this stance? I agree that it would be sensible for copyright law to allow…
I’m not sure how that’s different from what I said? Any downstream user of the UnRAR code that is in violation of that clause would then subject us to the onerous termination clause were we to build and distribute it. Packaging one piece of software that uses the UnRAR code in a prohibited way would revoke our licence to distribute UnRAR at all.
An arbitrary third‐party could send pull requests to NixOS to the MongoDB test, which would result in us exercising MongoDB’s functionality as requested through ofborg and then Hydra. The question is whether this counts as “mak[ing] the functionality of the Program or a modified version available to third parties as a service”. I agree that this is far from certain. But I do not think it is certain to be false either. The SSPL is not designed to be unambiguous or easy to comply with if you do anything that could remotely count as that. And this isn’t the AGPL, where you just have to publish the code for the actual thing you run. See the quoted portion of the SSPL, which clearly states “all programs that you use to make the Program or modified version available as a service, including, without limitation, management software, user interfaces, application program interfaces, automation software, monitoring software, backup software, storage software and hosting software”. It was explicitly designed so that Amazon could not offer MongoDB as SaaS without open sourcing the entirety of AWS. The scope is essentially unlimited, and if the clause applies, would in fact include all the software we use to manage our infrastructure. We can’t apply FOSS norms where licences are intended to be easy to comply with and participants are generally sensible and acting in good faith with the desire to have their code redistributed. The SSPL was designed as a legal stick to stop organizations that were doing things that MongoDB Inc. didn’t like. Non‐free licences are far more often deliberately adversarial and backed by expensive lawyers, and we have to take this into account when discussing compliance and risk.
See above.
I agree that in practice the risk is not completely unlimited. However, it is far higher than with FOSS.
It would; the proposal is explicitly to allow certain packages to disable the check on Hydra:
If a user with non‐free software enabled makes a PR and nobody with it disabled reviews it, then there will be no early warning for non‐free software in a free package’s closure. CI, based on the settings of the release jobset, is our current backstop against this happening. That’s why it explicitly carves out the release ISOs; I’m saying that the problem extends beyond those.
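For context, the eval-time gate being discussed is nixpkgs' existing unfree check, which end users opt out of today with configuration like the following (these are the real existing options; the package names are just examples):

```nix
# NixOS module snippet using the existing nixpkgs options; the RFC would
# let Hydra build a vetted subset without users needing to do this locally.
{ lib, ... }:
{
  # Either allow all unfree packages…
  # nixpkgs.config.allowUnfree = true;

  # …or only a hand-picked list (names here are illustrative):
  nixpkgs.config.allowUnfreePredicate = pkg:
    builtins.elem (lib.getName pkg) [
      "mongodb"
      "unrar"
    ];
}
```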
My comment was directed at @Ekleog as the RFC author, who said:
I did not see your comment before posting #185 (comment) and have replied to it in #185 (comment).
IANAL, but IIRC the reason that you are allowed to do this is that, unlike modifying ELF files with patchelf, this specific action is permitted by copyright law. I found https://digital-law-online.info/lpdi1.0/treatise20.html — not sure if it's a trustworthy resource, but it matches what I remember having read about this before.
It seems like this states that loading into RAM is likely a copy under US law, right? If that is the case, I presume that it is permitted because EULAs grant the right to run programs (explicitly or implicitly), and the copying and in‐memory modification involved in doing so is an unavoidable part of running a program. Perhaps that extends to modifying the on‐disk ELF files such that they can be run, although it still seems to me that the picture gets very lawyer‐requiringly murky once we bring redistribution into the picture.
I will definitely go over it once again once I'm back to actually push this RFC forward this evening. This being said, I also regard the discussion about specific derivations as off-topic for this thread, as is explicitly mentioned in the RFC text. So please forgive me for not reading in depth the long and off-topic discussion, which I specifically tried to prevent by writing the RFC as defensively as possible, because I knew the discussion would otherwise derail into each specific package, and legal discussions are not ones technical people are good at having.

This being said, this discussion did raise a very interesting point in @Mic92's answer: we currently have some packages that should be marked as unfree and are not, because of technical limitations. This is probably more dangerous than an unfree package sneaking its way into a free package's closure and being detected only by people with unfree disabled. So I'll add that to the RFC too, and will read the rest of the discussion without implicating myself too much, as I'm noticing these few messages from today are already too much.

Anyway, have a good afternoon and weekend!
I just committed:
Okay, I’ve tried my best to give a thorough review. Sorry for not finding the time to do this months ago.
This means that unfree redistributable software needs to be rebuilt by all the users.
For example, using MongoDB on a Raspberry Pi 4 (aarch64, which otherwise has access to Hydra's cache) takes literally days and huge amounts of swap.

Hydra could provide builds for unfree redistributable software, at minimal added cost.
This would make life much better for users of such software.
Especially when the software is still source-available even without being free software, like MongoDB.
If the intent isn’t to commit to providing any specific package, then I think using a concrete example of MongoDB in the motivation section is misleading, as accepting this RFC does not necessarily mean that this motivation will be addressed.
# Detailed design
[design]: #detailed-design

We will add a `runnableOnHydra` field on all licenses, that will be initially set to its `free` field, and set to `true` only for well-known licenses.
I still feel that doing this per‐package makes more sense given that I think we would want oversight to be on a package‐by‐package basis.
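To make the two designs being compared concrete, here is a hedged sketch; `runnableOnHydra` on the licence is what the RFC proposes, while the `meta.runnableOnHydra` attribute and the predicate are hypothetical illustrations of the per-package alternative:

```nix
# Sketch only; neither attribute location is implemented anywhere yet.
let
  lib = (import <nixpkgs> { }).lib;

  # Variant 1 (the RFC): a per-licence flag, initially mirroring `free`.
  ssplWithFlag = lib.licenses.sspl // { runnableOnHydra = false; };

  # Variant 2 (this review): a per-package meta flag. A hypothetical
  # predicate Hydra could use, consulting the package first and falling
  # back to the licence (assumes a single licence for brevity):
  runnableOnHydra = drv:
    drv.meta.runnableOnHydra
      or (drv.meta.license.runnableOnHydra or false);
in
  { inherit ssplWithFlag runnableOnHydra; }
```

The per-package variant keeps the oversight granularity at the level where the legal review actually happens, at the cost of some duplication across packages sharing a licence.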
- `iso_gnome`
- `iso_minimal`
- `iso_minimal_new_kernel`
- `iso_minimal_new_kernel_no_zfs`
- `iso_plasma5`
- `iso_plasma6`
nit: The graphical ISOs were unified, and Plasma 5 is gone. But I don’t know if it’s worth listing them explicitly anyway.
- `sd_image_new_kernel`
- `sd_image_new_kernel_no_zfs`

This RFC offers absolutely no more guarantees than the current statu quo, as to whether proprietary packages will or not build on hydra.
nit: status quo; Hydra
This RFC offers absolutely no more guarantees than the current statu quo, as to whether proprietary packages will or not build on hydra.
In particular, proprietary packages will not necessarily be part of the Zero Hydra Failures project upon release,
though release managers could, at their own discretion, decide to include some specific proprietary packages in there.
What kind of packages do you envision being chosen here?
I don't actually envision any — this being said, I'm not a release manager and definitely don't have enough time to do that work, so I don't know all of their constraints.
I just don't want to have this RFC formally ban release managers from doing whatever they feel is best suited to doing a good release 😄
Do we need a specific `redistributableWhenPatched` field on the license?
It feels like this would be a bit too much, and probably `redistributable` would be enough.
However, we may need to have it still.
We surely can’t legally patch the majority of `unfreeRedistributable` software. We would have to hope that we have legal advice that `patchelf` doesn’t count and that we don’t need to do more than that for anything we want to redistribute.
- **Actually tagging licenses and packages as `runnableOnHydra`.**
  Without this, this RFC would have no impact.
  This will be done package-by-package, and should require no RFC, unless there are significant disagreements on whether a license should be runnable on hydra or not.
I think such licence review would require legal advice from the Foundation more than another RFC. But if we’re doing things package‐by‐package then this should presumably be a `meta` field on packages as I suggested, rather than part of the licence.
This could be automatically computed by evaluating the Nix scripts.
In particular, we could have a specific `enforceFree` meta argument that'd enforce that this derivation as well as all dependencies are transitively free.
Implementing this may be doable in pure nix, or could require an additional hydra check.
This is left as future work, because even without validating licenses this RFC probably reduces the risk for FOSS users from installing proprietary software.
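To illustrate the `enforceFree` idea from the quoted paragraph, a pure-Nix sketch might look like the following. The traversal over `buildInputs` and `propagatedBuildInputs` is a deliberate simplification of my own; a real check would have to walk all derivation inputs and memoize visited paths:

```nix
# Hedged sketch of an `enforceFree`-style transitive licence check.
# Simplification: only walks common input lists and does not memoize,
# so it can revisit shared dependencies many times.
let
  pkgs = import <nixpkgs> { };
  inherit (pkgs) lib;

  isFree = drv:
    lib.all (l: l.free or true) (lib.toList (drv.meta.license or [ ]));

  enforceFree = drv:
    assert lib.assertMsg (isFree drv)
      "${drv.name or "<unnamed>"} carries a non-free licence";
    lib.all enforceFree
      (lib.filter lib.isDerivation
        ((drv.buildInputs or [ ]) ++ (drv.propagatedBuildInputs or [ ])));
in
  enforceFree pkgs.hello
```

Evaluating this should return `true` for a free package; on a package with a non-free dependency the assertion aborts evaluation, which is roughly the behaviour the future-work item asks for.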
I really do not see how it reduces that risk at all and think this line should be removed.
See the macOS discussion above.
After discussion in the infrastructure room on Matrix: according to the State of the Union presentation from March, our current cache size growth rate is around 6.56 TiB/month. @vcunat’s rough estimate in #185 (comment) was that the numbers quoted here would result in an increase of around 2 TiB/month uncompressed. That means that, depending on compression ratio, we’d be looking at around 15–30% of the current total cache size growth rate added on top of the current number.

I think that would more than wipe out the work that has been done to make the cache growth more sustainable after the funding crisis. That really needs to be listed as a drawback of this RFC; I doubt there has been any one change in the project that has made a change to cache size growth rate that substantial, and I think we are only currently managing to keep the cache operating because of an AWS sponsorship (that I remember hearing is conditional on doing work to mitigate the growth? but I’m not sure about the details there).

Per #185 (comment), it does seem like by far the majority of that would be NVIDIA drivers and CUDA. So if those were excluded then the impact probably wouldn’t be a big deal, but of course those are a lot of what people would like from this anyway.
shepherd-team: (names, to be nominated and accepted by RFC steering committee)
shepherd-leader: (name to be appointed by RFC steering committee)
Suggested change:

shepherd-team: @Mic92, @roberth, @Lassulus
shepherd-leader: @Mic92
RFCSC: We are appointing @Mic92 (lead), @roberth, and @Lassulus as shepherds. @emilazy seems very engaged as well; we encourage her to nominate herself as a shepherd too.
@Ekleog Can you accept this revision?
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/rfcsc-meeting-2025-03-31/62432/1
@emilazy Unless I'm the one misunderstanding, I think you're misunderstanding the 700G figure. AFAIU this number is actually the full closure size of the unfree binary caches, which means it also includes FOSS that is already being cached. So it's most likely a very large overestimate of the actual cache size increase we'll be seeing.

This being said, I thought I had already listed it in the drawbacks section. Thank you for the reminder; I'll do that next time I amend this RFC, along with checking your other comments. I'll explicitly write that we plan on monitoring the cache size evolution and rolling back any package that consumes too much cache space.
So should we get started on a first meeting?