Conversation

@lefou
Member

@lefou lefou commented Nov 27, 2025

Fix: #6226

I just hardcoded the limit to 2 for now.

Pull request: #6260

@lefou lefou force-pushed the tr-scalajs-linker-parallel branch from 72beb07 to d2aa908 Compare November 27, 2025 17:26
Contributor

@davesmith00000 davesmith00000 left a comment

LGTM, thanks for picking this up @lefou. 🙏

It's kind of unsatisfying having to hardcode a value here, but I believe that setting a sensible hardcoded value is better than people accidentally falling into an OutOfMemory black hole by performing a common action, such as running all their tests.

def scalaJSWorker: Worker[ScalaJSWorker] = Task.Worker {
  new ScalaJSWorker(
    jobs = Task.ctx().jobs,
    linkerJobs = 2
  )
}
Contributor

Just a suggestion: perhaps this value could be exposed on ScalaJSModuleAPI? It could have a nice low default but allow people to tweak it to their needs, or based on some environmental heuristic, e.g. they have a massive CI server and can afford to open up the parallelism.

Exposing it on the API also slightly improves the transparency around what's going on here, but perhaps this will need to be documented somehow?
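
A rough sketch of what that might look like (the task name scalaJSLinkerJobs and its exact signature are assumptions here, not Mill's actual API):

trait ScalaJSModule extends mill.scalalib.ScalaModule {
  // Hypothetical config task: conservative default, overridable per build,
  // e.g. raised on a large CI machine that can afford more parallel link jobs.
  def scalaJSLinkerJobs: Task[Int] = Task { 2 }

  def scalaJSWorker: Worker[ScalaJSWorker] = Task.Worker {
    new ScalaJSWorker(
      jobs = Task.ctx().jobs,
      linkerJobs = scalaJSLinkerJobs()
    )
  }
}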

Member Author

@lefou lefou Nov 28, 2025

Yeah, I already thought about how to configure it, but didn't want to overengineer it.

The natural place for a config task would be the ScalaJSWorker, which is currently not designed to be customized in the way other workers are, for example the JvmWorkerModule. Also, since there is potentially more than one ScalaJSWorker, we would need to introduce a new shared worker, so this route isn't a trivial change.

What would be somewhat easier is accepting an environment variable.
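
Roughly something like this (the variable name MILL_SCALAJS_LINKER_JOBS is made up for the sketch):

// Hypothetical: allow an opt-in override via the environment, falling back
// to the conservative default of 2 parallel linker jobs.
val linkerJobs: Int =
  sys.env.get("MILL_SCALAJS_LINKER_JOBS").flatMap(_.toIntOption).getOrElse(2)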

Also, we should converge on a "sensible default". I don't work with Scala.js often, so I have no "feeling" for what a good value might be. We might also apply some logic based on heuristics, but I don't have any.

Contributor

I'm not sure what heuristics you could sensibly apply, and I suspect attempting that might be a lot of work for not much reward. 🤷

FWIW, @lolgab was suggesting a concurrency of 1 in a discussion on Discord, and I'm using 2 in CI:
https://github.com/PurpleKingdomGames/indigoengine/blob/main/ci.sh#L9-L10

@lefou
Member Author

lefou commented Dec 5, 2025

I'm just witnessing a fatal OOM in the coursier release process: https://github.com/coursier/coursier/actions/runs/19949228402/job/57231194898

 [3549] core.js[2.12.20].resolvedMvnDeps 658s
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid2482.hprof ...
Heap dump file created [6755211786 bytes in 76.676 secs]
Exception in thread "Process ID Checker Thread" java.lang.OutOfMemoryError: Java heap space
	at java.base/jdk.internal.misc.Unsafe.allocateInstance(Native Method)
	at java.base/java.lang.invoke.DirectMethodHandle.allocateInstance(DirectMethodHandle.java:501)
	at java.base/java.lang.invoke.DirectMethodHandle$Holder.newInvokeSpecial(DirectMethodHandle$Holder)
	at java.base/java.lang.invoke.Invokers$Holder.linkToTargetMethod(Invokers$Holder)
	at mill.server.Server$.checkProcessIdFile(Server.scala:484)
	at mill.server.Server$.$anonfun$7(Server.scala:511)
	at mill.server.Server$$$Lambda/0x00007f12b00ffd38.run(Unknown Source)
	at java.base/java.lang.Thread.runWith(Thread.java:1596)
	at java.base/java.lang.Thread.run(Thread.java:1583)

I think I want to merge this.

@davesmith00000
Contributor

On the bright side: At least it is reproducible, and we know what the problem is. 🙂

@lolgab
Member

lolgab commented Dec 5, 2025

@lefou Are we sure this solves the problem? The problem is not about limiting the parallelism, but about limiting the memory consumption of the Scala.js linkers. Even if we limit it to 2, we still have a cache that stores up to ctx.jobs linkers. So we could still end up allocating the same amount of memory, no?

@davesmith00000
Contributor

@lolgab I guess that depends on the implementation. Currently I work around the problem by manually forcing a concurrency limit, in order (to my naive understanding) to avoid the system choking / memory thrashing.

https://github.com/PurpleKingdomGames/indigoengine/blob/main/ci.sh#L9-L10

@lefou
Member Author

lefou commented Dec 5, 2025

TBH, I have no idea. I'm just trying to solve a blocking issue based on the provided input.

We probably also need to limit the cached jobs to the same size. Alternatively, or in addition, we could try to hold the cache in a soft or weak reference, so that the garbage collector has a chance to evict unused instances. (We did this before for some caches, but I'm not sure that code is still in place, since there have been many rounds of refactoring since.)

Question: What is a good default for parallel linker jobs? This PR uses 2, mostly because that was reported as a good number, but I have no idea what's reasonable.

We should also add some metrics, so we better understand the error cases.

@lolgab
Member

lolgab commented Dec 5, 2025

Alternatively, or in addition, we could try to hold the cache in a soft or weak reference, so that the garbage collector has a chance to evict unused instances. (We did this before for some caches, but I'm not sure that code is still in place, since there have been many rounds of refactoring since.)

This unfortunately doesn't work. We did it before, but it wasn't working because Scala.js needs a cleanup method to be called to clean the cache, otherwise it gets leaked. SoftReference caches can't call finalizers when they get garbage collected.
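
To make the constraint concrete: a size-limited cache would have to invoke the linker's cleanup explicitly on eviction, rather than rely on the GC. A minimal sketch of that shape, where the cleanup callback merely stands in for whatever the real linker cleanup would be:

import java.util.{LinkedHashMap, Map => JMap}

// Hypothetical: a bounded cache that runs an explicit cleanup on eviction,
// instead of relying on GC-driven SoftReference collection.
class EvictingCache[K, V](maxEntries: Int, cleanup: V => Unit)
    extends LinkedHashMap[K, V](16, 0.75f, true) {
  // LinkedHashMap consults this after each put; returning true evicts the
  // eldest entry, so we clean it up before it is dropped.
  override def removeEldestEntry(eldest: JMap.Entry[K, V]): Boolean = {
    val evict = size() > maxEntries
    if (evict) cleanup(eldest.getValue)
    evict
  }
}

Whether this maps onto the actual ScalaJSWorker caches is a separate question; it only illustrates the eviction-with-cleanup shape.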

@lefou
Member Author

lefou commented Dec 5, 2025

@davesmith00000 Could you by any chance check if this PR as-is fixes your issue (without applying your other workarounds, like limiting --jobs)?

@lefou
Member Author

lefou commented Dec 5, 2025

Regarding the linker state: I don't know what the benefits of not clearing the linker are, but we should be able to auto-clean it after each use. That hopefully means we don't hog unneeded memory, but still keep the JIT-ed classes around.

@lefou
Member Author

lefou commented Dec 5, 2025

But maybe you don't mean the linker state, but the IRFileCache.Cache. I don't know what's best to do here.

@lolgab
Member

lolgab commented Dec 5, 2025

Regarding the linker state: I don't know what the benefits of not clearing the linker are, but we should be able to auto-clean it after each use. That hopefully means we don't hog unneeded memory, but still keep the JIT-ed classes around.

This basically kills the benefit of having a worker, since the Scala.js linker is no longer incremental.

I'm thinking about what the best approach is to avoid OOMs while keeping good parallelism and the incremental state.

@lefou
Member Author

lefou commented Dec 5, 2025

I guess we need some runtime stats, and then decide, based on total and/or relative memory consumption, which caches to keep and which to drop. Theoretically, there are various kinds of data a worker can keep, but not all state might provide the same benefit from being kept. E.g. intermediate compile results can be written to and read from disk, but still provide a benefit over re-computation of the whole result. In the end, a cache so large that it causes OOMs is worse than no cache at all.

A classloader cache is much cheaper than some in-memory cache of intermediate binary results, while still ensuring high performance thanks to JIT-ed bytecode.

@davesmith00000
Contributor

@lefou I thought I'd try quickly testing this during my lunch break, but my efforts have been hampered by the forced upgrade to Scala 3.8.0-RC1 that Mill requires:

  1. 3.8.0-RC1 seems to have some weird behaviours around unused code (sometimes it's wrong...).
  2. 3.8.0-RC1 has reclassified some warnings, it seems, so a lot of patching was required.
  3. I don't understand the relationship between Mill's Scala 3 version, my plugin's Scala 3 version, and my main project's Scala 3 version. Currently, if they aren't aligned, bad things happen.
  4. One of my modules' tests now refuses to compile.

Anyway, in terms of concurrently running fastOptJS, it seems better. I can't be 100% sure until I fix point (4) above, but it was happily linking 8-10 modules concurrently at one point.

@lefou
Member Author

lefou commented Dec 5, 2025

Thank you @davesmith00000! I assume that before, you were not able to have 8-10 link tasks running in parallel.

I'll merge this PR in the hope it helps. At least, it shouldn't make things worse. We can address the ScalaJS worker cache in a separate PR.

@lolgab
Member

lolgab commented Dec 5, 2025

I've been trying to wrap my head around the ScalaJSWorker caching many times. It's complicated.
To recap, this is what we have.

We have a first layer of caching where we have an instance of ScalaJSWorkerImpl for every different Scala.js classloader.
So, more or less, we have an entry for every different scalaJSVersion we have in the process, with a maximum of ctx.jobs.

Then we have a second layer of caching, where we cache the linkers. For every ScalaJSWorkerImpl instance, we have up to ctx.jobs linkers, one for ~every entry of the module/isFullLinkJS matrix.

On top of this, the way mill.util.CachedFactory works is that if you request more entries than maxCacheSize, it allocates a new linker, links, and then disposes it right away.

What I would want is a single limit that is somehow shared by the two caches, so we keep control over the total number of linkers we instantiate, not only over those held in any one of the instantiated ScalaJSWorkerImpls.

Moreover, maybe the behavior we have in mill.util.CachedFactory of creating an instance and dropping it right away is part of the problem. Maybe it should keep an internal semaphore, like the one you implemented for Scala.js, and block whoever tries to create more entries than maxCacheSize?
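
As a rough sketch of that idea (BoundedFactory and its setup/teardown hooks are made-up stand-ins for mill.util.CachedFactory, and the actual caching is omitted to keep it short):

import java.util.concurrent.Semaphore

// Hypothetical: once maxCacheSize entries are in use, additional callers
// block on the semaphore instead of allocating throwaway instances.
class BoundedFactory[K, V](maxCacheSize: Int, setup: K => V, teardown: V => Unit) {
  private val permits = new Semaphore(maxCacheSize)

  def withValue[R](key: K)(use: V => R): R = {
    permits.acquire()
    val value = setup(key)
    try use(value)
    finally {
      teardown(value)
      permits.release()
    }
  }
}

A real version would also keep up to maxCacheSize entries alive for reuse; the point here is only the back-pressure, i.e. that excess callers wait rather than create and immediately dispose extra linkers.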

@davesmith00000
Contributor

I assume that before, you were not able to have 8-10 link tasks running in parallel.

Correct. Previously it would start 8-10 in parallel but grind to a halt.

The behaviour I observe now is that it is completing what it can complete and only getting stuck on the troublesome module, which is some unrelated issue.

Comment on lines 300 to 305
val res = Await.result(resultFuture, Duration.Inf)
linker match {
  case cl: ClearableLinker => cl.clear()
  case _ => // no-op
}
res
Member

This change breaks Scala.js incremental linking.

Member Author

I can revert it. The API docs don't say that this is related to incremental linking.

Member Author

I can revert it.

}
}

private val linkerJobLimiter = ParallelismLimiter(linkerJobs)
Member

@lolgab lolgab Dec 5, 2025

I think you should pass linkerJobs instead of jobs to

    val bridge = cl
      .loadClass("mill.scalajslib.worker.ScalaJSWorkerImpl")
      .getDeclaredConstructor(classOf[Int])
      .newInstance(jobs)
      .asInstanceOf[workerApi.ScalaJSWorkerApi]

Since we are running two linker jobs at a time to save memory, if we store 8 different ones in memory, we aren't saving as much memory as we want.
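
In other words, something like this (assuming linkerJobs is in scope at that point):

    val bridge = cl
      .loadClass("mill.scalajslib.worker.ScalaJSWorkerImpl")
      .getDeclaredConstructor(classOf[Int])
      .newInstance(linkerJobs)
      .asInstanceOf[workerApi.ScalaJSWorkerApi]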

Member Author

My working hypothesis was that the high memory usage is required while the linking is in progress, but most of it gets freed afterwards. That means that by delaying/synchronizing linking jobs, we already reduce the memory pressure. #6260 (comment) seems to support, or at least not contradict, this hypothesis.

Member

Consider that the test was performed with the code that clears the linker after every link step, which means the linker's memory gets cleaned up afterwards. If we keep the linkers in memory and do not clear them, the test could give different results.

Development

Successfully merging this pull request may close these issues.

Limit the number of parallel running ScalaJS linker processes
