Commit 494ee2c

Merge pull request #61 from hasufell/midstream-bindists
Add midstream bindist proposal
2 parents f59cd98 + ac3c4a8 commit 494ee2c

1 file changed

+281
-0
lines changed

proposals/0000-midstream-bindists.md

Lines changed: 281 additions & 0 deletions
@@ -0,0 +1,281 @@
1+
# Midstream bindists
2+
3+
## Abstract
4+
5+
Bindists for the Haskell toolchain have been produced by upstream (the developers of each respective tool) for
6+
a long time and many tools rely on these "official" bindists (e.g. GHCup and stack).
7+
8+
We propose here that bindists are built and maintained by the GHCup project, which
9+
provides the main installation experiences in the Haskell ecosystem, removing the hard
10+
dependency on upstream bindists entirely.
11+
12+
## Background
13+
14+
Historically, installers like GHCup and stack have used upstream bindists for mainly one reason: it's easy to do
15+
so and doesn't require further efforts.
16+
17+
However, using upstream bindists directly is extremely rare in the world of Linux distributions. Most distributions
18+
build, package, test and curate binary packages themselves, not only because they have custom formats, but for
19+
reasons of control, trust and quality.
20+
21+
The relationship of GHCup and bindists has also been described in the blog post
22+
[GHCup is not an installer](https://hasufell.github.io/posts/2023-11-14-ghcup-is-not-an-installer.html) recently.
23+
24+
## Problem Statement
25+
26+
From the perspective of a GHCup developer, there are several issues with relying on upstream bindists.
27+
28+
### Platform support
29+
30+
GHC and other tools have in the past either dropped support for certain platforms entirely or asked
31+
the community to step up and do the work (e.g. on GHC CI).
32+
33+
E.g. [GHC ARMv7 support was dropped silently without any call for help](https://gitlab.haskell.org/ghc/ghc/-/issues/21177#note_470440). Similarly, FreeBSD support just ceased to exist when the GHC FreeBSD CI stopped working. Later the community asked for a [revival](https://gitlab.haskell.org/groups/ghc/-/epics/5), but nothing significant has happened so far.
34+
GHCup still produces bindists from time to time for FreeBSD, but e.g. the [HLS release manager for 2.5.0.0 recently refused to add FreeBSD bindists](https://github.com/haskell/ghcup-metadata/pull/159).
35+
36+
Similarly, stack used to have issues with Aarch64 bindists:
37+
38+
* https://github.com/commercialhaskell/stack/issues/5709
39+
* https://github.com/commercialhaskell/stack/issues/5854
40+
* https://github.com/commercialhaskell/stack/issues/5540
41+
* https://github.com/commercialhaskell/stack/issues/5610
42+
* https://github.com/commercialhaskell/stack/issues/5619
43+
* https://github.com/commercialhaskell/stack/issues/6141
44+
* https://github.com/commercialhaskell/stack/issues/6142
45+
46+
Recently, cabal-install had issues with i386 binaries and Alpine, delaying a GHCup metadata PR:
47+
48+
* https://github.com/haskell/ghcup-metadata/pull/127#issuecomment-1766020410
49+
50+
These issues are frequent, and so far the GHCup developers have had to single-handedly fix all those missing bindists manually
51+
and provide them here: https://downloads.haskell.org/~ghcup/unofficial-bindists/
52+
53+
This is unfunded and significant work.
54+
55+
### Bindist maintenance
56+
57+
Sometimes, bindists are broken, e.g. for GHC there are a couple of instances:
58+
59+
* 9.0.2 shipping without profiling info: https://gitlab.haskell.org/ghc/ghc/-/issues/21841
60+
* DESTDIR variable ignored by `make install`: https://gitlab.haskell.org/ghc/ghc/-/issues/19646
61+
62+
Sometimes, bindists have been built for very old versions of Linux distros and won't run well on newer Linux versions.
63+
This is currently a problem since Debian has removed ncurses5:
64+
65+
* https://github.com/haskell/ghcup-hs/issues/902
66+
* https://answers.launchpad.net/ubuntu/+source/ncurses/+question/707838
67+
* https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1025964
68+
69+
For cabal, there has been the infamous hLock issue:
70+
71+
* https://github.com/haskell/cabal/issues/7313
72+
* https://github.com/haskell/cabal/issues/7950
73+
74+
This shows that bindists, for current and historical versions, need continuous maintenance. However, upstream developers
75+
so far have very rarely engaged in this type of maintenance work, pushing it down to GHCup. As an example, here are all the
76+
manually patched and re-packaged bindists that fix the DESTDIR bug outlined above: https://downloads.haskell.org/ghcup/unofficial-bindists/ghc/curated/
77+
78+
This type of work also requires significant time.
79+
80+
### Quality gateway
81+
82+
There's something pleasing about upstream providing bindists: there's a perceived trust about it in the community,
83+
e.g. the [haskell.org committee expressed concerns about GHCup changing bindists in the past](https://github.com/haskell-infra/www.haskell.org/issues/212#issuecomment-1272312911).
84+
85+
However, rarely do upstream developers have significant experience in redistribution, nor do they have the time to focus
86+
on all the issues that come with it. Bindists are mostly provided "as-is" and support beyond what the release CI outputs is
87+
left to midstream (GHCup, stack, ...).
88+
89+
Conceptually, it is a good idea to separate concerns: upstream provides the sources and has tested them. Distributions build the binaries,
90+
validate that the program can be successfully built from source and make sure that the final artifacts pass the test suite.
91+
This is good, because we want to know whether end users can build e.g. a functioning GHC from source. If only the release CI outputs
92+
a GHC that passes the test suite, then something is fundamentally broken.
93+
94+
Distributors are often closer to the end-users and can provide additional support and efforts for the installation experience.
95+
96+
### Security backports
97+
98+
Cabal was recently found to be vulnerable to [HSEC-2023-0015](https://github.com/haskell/security-advisories/blob/main/advisories/hackage/cabal-install/HSEC-2023-0015.md).
99+
100+
This vulnerability [had not been communicated to the GHCup team](https://github.com/haskell/security-advisories/issues/129) prior to disclosure,
101+
which made a backport urgent: at the time of disclosure, cabal-3.6.2.0
102+
was still 'recommended' by GHCup, because [cabal-3.10.2.0 is still broken on Windows](https://github.com/haskell/cabal/issues/9334).
103+
104+
As such, GHCup developers needed a quick and efficient way to:
105+
106+
* patch cabal
107+
* build release binaries for cabal
108+
* ship the binaries
109+
110+
This was done as a downstream release `3.6.2.0-p1` roughly a week after the disclosure.
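
As an illustration of the versioning side of such a backport, here is a minimal sketch of how a downstream patch release name could be derived from an upstream version. Only the name `3.6.2.0-p1` comes from the text above; the general `-pN` scheme, the function name `next_downstream_release` and the regular expression are assumptions made for this example and do not describe GHCup's actual tooling.

```python
import re

# Assumed convention: "<upstream version>-p<N>", generalised from the single
# example "3.6.2.0-p1" mentioned in the proposal.
_VERSION = re.compile(r"^(?P<base>\d+(?:\.\d+)*)(?:-p(?P<rev>\d+))?$")


def next_downstream_release(version: str) -> str:
    """Return the next downstream patch release name for `version`.

    An upstream version without a suffix gets "-p1"; an existing
    downstream release "-pN" is bumped to "-p(N+1)".
    """
    match = _VERSION.match(version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    revision = int(match.group("rev") or 0) + 1
    return f"{match.group('base')}-p{revision}"


if __name__ == "__main__":
    print(next_downstream_release("3.6.2.0"))
    print(next_downstream_release("3.6.2.0-p1"))
```

Keeping the upstream version as a prefix makes it clear which upstream release the patched binaries are based on.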
111+
112+
Meanwhile, cabal upstream still has not finished their backport due to issues
113+
with hackage dependencies: https://github.com/haskell/cabal/pull/9457
114+
115+
This shows that a certain amount of independence from upstream CI and upstream workflow
116+
is essential for delivering swift security backports to potentially under-maintained
117+
branches/versions.
118+
119+
### GHC nightlies
120+
121+
As a special case, we want to point out that GHC nightlies have been frequently broken beyond repair:
122+
123+
* https://gitlab.haskell.org/ghc/ghcup-metadata/-/issues/2
124+
* https://gitlab.haskell.org/ghc/ghc/-/issues/24000
125+
126+
The breakage was left unattended and some bindists completely vanished from gitlab CI artifacts, because of misconfiguration.
127+
Here's a graph of nightlies availability: https://grafana.gitlab.haskell.org/d/ab109e66-a8a1-4ae9-b976-40e2dfe281ab/availabilitie-of-ghc-nightlies-via-ghcup?orgId=2
128+
129+
## Prior Art, Related Efforts and alternative solutions
130+
131+
So far, GHCup developers have tried to close the gap, doing significant work on upstream CIs and building bindists manually where
132+
necessary.
133+
134+
Bindists for cabal-install are now produced by GHCup's own CI: https://github.com/haskell/ghcup-metadata/blob/develop/.github/workflows/cabal-release.yaml
135+
136+
HLS will likely follow shortly.
137+
138+
As mentioned before, there have been attempts to improve the coordination and collaboration across the entire Haskell
139+
toolchain, view the tooling end-user experience in a holistic way and make decisions based on that end-user experience: https://github.com/haskellfoundation/tech-proposals/issues/48
140+
141+
The proposers believe that the community structure at the moment does not allow such an approach and there needs to be
142+
significant work to align goals, perception and priorities. Otherwise there will be too much friction.
143+
144+
The most important currency in the open source volunteer world is **energy**. It is not code or technical effort. As such we
145+
believe that the amount of saved energy by being more independent of upstream release processes and decisions far outweighs
146+
potential costs of technical redundancy/duplication.
147+
148+
A future proposal may very well attempt to create a unified user experience across the entire Haskell toolchain
149+
through joint management and collaboration. But that is not in the scope of this proposal and we have no concrete
150+
idea how to achieve that.
151+
152+
## Technical Content
153+
154+
We propose here to start with the smallest step possible, to build the entire Haskell toolchain autonomously.
155+
The way this will be implemented is to start a central GitHub repository that builds bindists for releases of:
156+
157+
- GHC
158+
- HLS
159+
- cabal
160+
- stack
161+
162+
For the following platforms:
163+
164+
- FreeBSD x86_64
165+
- Linux i386
166+
- Linux x86_64
167+
- Linux armv7
168+
- Linux aarch64
169+
- Darwin x86_64
170+
- Darwin aarch64
171+
- Windows x86_64
172+
173+
For the following Linux x86_64 distros:
174+
175+
- Debian
176+
- Ubuntu
177+
- Mint
178+
- Fedora
179+
- CentOS
180+
- RedHat
181+
- Rocky Linux
182+
- Void Linux
183+
- Amazon Linux
184+
- Alpine
185+
186+
Linux i386, armv7 and aarch64 will be confined to Debian or Ubuntu.
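
To make the size of the resulting build matrix concrete, the following sketch enumerates the tool, platform and distro combinations listed above. The data structures and function names are hypothetical, and because the text says "Debian or Ubuntu" for i386, armv7 and aarch64, Debian is chosen here purely as an assumption for illustration.

```python
from itertools import product

# Tools and Linux x86_64 distros exactly as listed in the proposal.
TOOLS = ["GHC", "HLS", "cabal", "stack"]

LINUX_X86_64_DISTROS = [
    "Debian", "Ubuntu", "Mint", "Fedora", "CentOS", "RedHat",
    "Rocky Linux", "Void Linux", "Amazon Linux", "Alpine",
]

# Assumption: the proposal confines these architectures to "Debian or Ubuntu";
# Debian is picked here only so the example has a concrete value.
RESTRICTED_LINUX_ARCHES = {"i386": "Debian", "armv7": "Debian", "aarch64": "Debian"}


def build_targets():
    """Return (os, arch, distro) triples; distro is None outside Linux."""
    targets = [("FreeBSD", "x86_64", None)]
    targets += [("Linux", "x86_64", distro) for distro in LINUX_X86_64_DISTROS]
    targets += [("Linux", arch, distro) for arch, distro in RESTRICTED_LINUX_ARCHES.items()]
    targets += [("Darwin", "x86_64", None), ("Darwin", "aarch64", None)]
    targets.append(("Windows", "x86_64", None))
    return targets


def build_jobs():
    """Return one (tool, os, arch, distro) job per tool and target."""
    return [(tool, *target) for tool, target in product(TOOLS, build_targets())]


if __name__ == "__main__":
    print(f"{len(build_targets())} targets, {len(build_jobs())} jobs per release round")
```

Under these assumptions the matrix has 17 targets and 68 jobs per round of releases, which gives a rough sense of the CI capacity the runners need to provide.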
187+
188+
The private CI runners that power this repository (see Funding) will be made available to the whole Haskell GitHub org and as such benefit
189+
other projects there as well (like HLS, Cabal, bytestring, etc.).
190+
191+
## Future work
192+
193+
The following ideas and goals are outside of the scope of this proposal, but are essential
194+
to understand the broader mission and roadmap that motivated this proposal.
195+
196+
### Enhancements to bindist quality and installation experience
197+
198+
Further goals are:
199+
200+
* enhance the quality of the bindists by
201+
- running the entire test suite for all of the tools
202+
- having a mechanism to report test failures back upstream
203+
- publishing test failures for end users to see
204+
- communicate test status of bindists clearly through e.g. GHCup
205+
- resolve GHC issues related to test bindists:
206+
* https://gitlab.haskell.org/ghc/ghc/-/issues/22726
207+
* https://gitlab.haskell.org/ghc/ghc/-/issues/22723
208+
* https://gitlab.haskell.org/ghc/ghc/-/issues/22727
209+
* make fixing bindists easier
210+
- implement revisions in GHCup: https://github.com/haskell/ghcup-hs/issues/361
211+
- make it easy to update an older GHC branch and re-run the release pipeline
212+
- make building upstream release binaries easier
213+
* https://github.com/haskell/cabal/issues/9461
214+
* https://github.com/haskell/haskell-language-server/issues/3878
215+
216+
One main idea is that bindists should be primarily tested **on the user's system**, because that is where they're going to run.
217+
It is great to know that e.g. the test suite passes on GHC CI, but that may have little value in different environments.
218+
Additionally, issues with tests can flow back to upstream developers and we may develop workflows and processes to streamline
219+
this type of feedback. Early release candidates can assist with this workflow.
220+
221+
Another necessary shift in perception is that upstream projects should recognize that their build systems
222+
are end-user interfaces, making it easier for both distributors and end users to build release binaries correctly themselves.
223+
224+
### Nightlies
225+
226+
We also want to make nightlies available for GHC, cabal and HLS. This will require
227+
coming up with a permanent storage solution and very robust nightly pipelines.
228+
229+
## Timeline
230+
231+
* 6 months: proof of concept of a central GitHub CI building most of the toolchain
232+
* 12 months: building GHC via GitHub Actions
233+
234+
## Funding
235+
236+
### Who and what
237+
238+
The GHCup project requests funding for private GitHub CI runners to power the midstream bindist release pipelines.
239+
It will receive and manage the funding in strict collaboration with the Haskell Foundation.
240+
241+
Volunteers who want to collaborate in midstream bindists are welcome to look at the project structure and collaboration guidelines:
242+
243+
* https://www.haskell.org/ghcup/about/#team
244+
* https://www.haskell.org/ghcup/dev/#contribution-process-and-expectations
245+
246+
### Budget
247+
248+
The following is an example/estimate of a budget (prices are in USD). The HF and the proposer will negotiate the exact terms in private.
249+
250+
- Linux/FreeBSD x86_64 runner on Hetzner (AX52)
251+
* monthly: $62.24
252+
* yearly: $746.88
253+
- Linux aarch64 runner on Hetzner (RX170)
254+
* monthly: $181.73
255+
* yearly: $2,180.76
256+
- Darwin runner on Hetzner (Mac Mini M1)
257+
* monthly: $56.59
258+
* yearly: $679.08
259+
260+
To host one private runner for each of these platforms, the yearly cost would be: **$3,606.72**
261+
262+
There will likely also be an initial setup cost (as is usual for Hetzner).
263+
264+
We may request more runners depending on the demand, so it may very well be 3 runners per platform, resulting in a yearly cost of: $10,820.16
265+
266+
**As such, the estimated budget is roughly 4k to 11k USD per year.**
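
The yearly figures above follow directly from the monthly prices. A small sketch of the arithmetic, using the monthly prices quoted in this section (the dictionary keys and the function name `yearly_cost` are illustrative, and the initial setup cost is not included):

```python
from decimal import Decimal

# Monthly prices in USD, as quoted in the budget list above.
MONTHLY_USD = {
    "Linux/FreeBSD x86_64 (AX52)": Decimal("62.24"),
    "Linux aarch64 (RX170)": Decimal("181.73"),
    "Darwin (Mac Mini M1)": Decimal("56.59"),
}


def yearly_cost(runners_per_platform: int = 1) -> Decimal:
    """Yearly cost in USD for the given number of runners on every platform."""
    return sum(MONTHLY_USD.values()) * 12 * runners_per_platform


if __name__ == "__main__":
    for name, monthly in MONTHLY_USD.items():
        print(f"{name}: {monthly * 12} per year")
    print(f"1 runner per platform:  {yearly_cost(1)}")
    print(f"3 runners per platform: {yearly_cost(3)}")
```

`Decimal` is used instead of floating point so that the totals match the quoted figures to the cent.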
267+
268+
## Stakeholders
269+
270+
* GHCup developers, who receive funding
271+
* GHC developers
272+
* cabal developers
273+
* stack developers
274+
* HLS developers
275+
* VSCode Haskell developers
276+
* Haskell toolchain end users
277+
278+
## Success
279+
280+
* reliable, continuously maintained bindists, readily available
281+
