Limit parallel Scala.js linking jobs to avoid high memory pressure #6260
Conversation
Fix: com-lihaoyi#6226. I just hardcoded the limit to `2`.
davesmith00000 left a comment
LGTM, thanks for picking this up @lefou. 🙏
It's kind of unsatisfying having to hardcode a value here, but I believe that setting a hardcoded sensible value is a better situation than people accidentally getting into an OutOfMemory blackhole by performing a common action, such as running all their tests.
    def scalaJSWorker: Worker[ScalaJSWorker] = Task.Worker {
      new ScalaJSWorker(
        jobs = Task.ctx().jobs,
        linkerJobs = 2
Just a suggestion: perhaps this value could be exposed on ScalaJSModuleAPI? It could have a nice low default but allow people to tweak it to their needs or based on some environmental heuristic, e.g. they have a massive CI server and can afford to open up the parallelism.
Exposing it on the API also slightly improves the transparency around what's going on here, but perhaps this will need to be documented somehow?
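For illustration, a minimal sketch of what such a knob could look like if it were exposed as an overridable task; `scalaJSLinkerJobs` is a made-up name and not part of the current Mill API:

```scala
// build.mill (sketch only -- `scalaJSLinkerJobs` is hypothetical, not an existing Mill task)
import mill._
import mill.scalalib._
import mill.scalajslib._

trait LimitedLinkJobs extends ScalaJSModule {
  // Conservative default; users with large CI machines could override this.
  def scalaJSLinkerJobs = Task { 2 }
}

object frontend extends LimitedLinkJobs {
  def scalaVersion = "3.3.4"
  def scalaJSVersion = "1.17.0"
}
```

The worker would then read this task instead of the hardcoded constant, which would also make the behaviour visible in the documented API surface.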
Yeah, I already thought about how to configure it, but didn't want to overengineer it.
The natural place for a config task would be the ScalaJSWorker, which is currently not designed to be customized in the way other workers are, for example the JvmWorkerModule. Also, since there is potentially more than one ScalaJSWorker, we would need to introduce a new shared worker, so this route isn't a trivial change.
What would be somewhat easier is accepting an environment variable.
Also, we should converge on a "sensible default". I don't work with Scala.js often, so I have no feeling for what a good value might be. We might also apply some logic based on heuristics, but I don't have any such heuristics.
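For what it's worth, a minimal sketch of the environment-variable route, assuming a made-up variable name (`MILL_SCALAJS_LINKER_JOBS` is not something Mill reads today) and falling back to this PR's default of 2:

```scala
import scala.util.Try

object LinkerJobsConfig {
  // MILL_SCALAJS_LINKER_JOBS is a hypothetical variable name, used here only for illustration.
  def defaultLinkerJobs: Int =
    sys.env
      .get("MILL_SCALAJS_LINKER_JOBS")
      .flatMap(s => Try(s.toInt).toOption)
      .filter(_ > 0)
      .getOrElse(2) // the hardcoded default from this PR
}
```

An environment variable avoids touching the module API, but it is less discoverable than a proper task.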
I'm not sure what reasonable heuristics you could sensibly apply, and I suspect attempting to do that might be a lot of work for not a lot of reward. 🤷
FWIW, @lolgab was suggesting a concurrency of 1 in a discussion on Discord, and I'm using 2 in CI:
https://github.com/PurpleKingdomGames/indigoengine/blob/main/ci.sh#L9-L10
I'm just witnessing a fatal OOM in the coursier release process: https://github.com/coursier/coursier/actions/runs/19949228402/job/57231194898
I think I want to merge this.
On the bright side: at least it is reproducible, and we know what the problem is. 🙂
@lefou Are we sure this solves the problem? Because the problem is not about limiting the parallelism, but about limiting the memory consumption of the Scala.js linkers. Even if we limit it to 2, we still have a cache that stores the linkers.
@lolgab I guess that depends on the implementation. Currently I work around the problem by manually forcing a concurrency limit, in order (to my naive understanding) to avoid the system choking / memory thrashing. https://github.com/PurpleKingdomGames/indigoengine/blob/main/ci.sh#L9-L10
TBH, I have no idea. I just try to solve a blocker issue based on the provided input. We probably also need to limit the cached jobs to the same size. Alternatively, or in addition, we could try to hold the cache in a soft or weak reference, so that the garbage collector has a chance to evict unused instances. (We did this before for some caches, but I'm not sure this code is still in place, since there have been many rounds of refactoring since.) Question: what is a good default for parallel linker jobs? This PR uses `2`, mostly because this was reported as a good number, but I have no idea what's reasonable. We should also add some metrics, so we better understand the error cases.
This unfortunately doesn't work. We did it before, but it wasn't working because Scala.js needs a cleanup method to be called to clean the cache, otherwise it gets leaked. SoftReference caches can't call finalizers when they get garbage collected. |
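To illustrate that limitation (a generic sketch, not Mill code): even with a `ReferenceQueue`, by the time a `SoftReference` is enqueued its referent has already been cleared, so there is no object left to invoke a cleanup method on:

```scala
import java.lang.ref.{ReferenceQueue, SoftReference}

object SoftRefCleanupProblem {
  // Stand-in for a linker-like resource that needs an explicit cleanup call.
  final class LinkerLike { def cleanup(): Unit = println("cleaning up") }

  def main(args: Array[String]): Unit = {
    val queue = new ReferenceQueue[LinkerLike]()
    val ref   = new SoftReference(new LinkerLike, queue) // soft handle kept by the cache

    // ...later, the GC clears the soft reference under memory pressure...

    val collected = queue.poll() // non-null once the reference has been cleared and enqueued
    if (collected != null) {
      // The referent is already gone: get() returns null, so cleanup() can no
      // longer be called and whatever the linker held internally is leaked.
      assert(collected.get() == null)
    }
  }
}
```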
@davesmith00000 Could you by any chance check if this PR as-is fixes your issue (without applying your other workarounds, like manually limiting the concurrency)?
Regarding the linker state: I don't know what the benefits of not clearing the linker are, but we should be able to auto-clean it after each use. That hopefully means we don't hog unneeded memory but still keep the JIT-ed classes around.
But maybe you don't mean the linker state, but the …
This basically kills the benefit of having a worker, since the Scala.js linker is no longer incremental. I'm thinking about what the best approach is to avoid OOMs while keeping good parallelism and the incremental state.
I guess we need some runtime stats, and then decide based on total and/or relative memory consumption which caches to keep and which to remove. Theoretically, there are various kinds of data a worker can keep, but not all state provides the same benefit from being kept. E.g. intermediate compile results can be written to and read from disk, but still provide a benefit over re-computing the whole result. In the end, a cache so large that it causes OOMs is worse than no cache at all. A classloader cache is much cheaper than some in-memory cache of intermediate binary results, while still ensuring high performance thanks to JIT-ed bytecode.
@lefou I thought I'd try quickly testing this during my lunch break, but my efforts have been hampered by the forced upgrade to Scala …
Anyway, in terms of concurrently running …
Thank you @davesmith00000! I assume that before, you were not able to have 8-10 link tasks running in parallel. I'll merge this PR in the hope it helps; at least, it shouldn't make things worse. We can address the ScalaJS worker cache in a separate PR.
I've been trying to wrap my head around the ScalaJSWorker caching many times. It's complicated.

We have a first layer of caching where we have an instance of …

Then we have a second layer of caching where we cache the linkers. For every …

On top of this, the way …

What I would want is a single limit that would be somehow shared by the two caches, so we keep control over the total number of linkers we instantiate, not only the ones that we have in one of the …

Moreover, maybe that behavior of creating an instance and dropping it right away that we have in …
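A sketch of that "single shared limit" idea (the names are made up, not existing Mill code): one process-wide counter that every cache must go through before instantiating a linker, so the total number of live linkers stays bounded no matter how many cache layers exist:

```scala
import java.util.concurrent.Semaphore

// Sketch only -- `SharedLinkerLimit` is a hypothetical name, not existing Mill code.
// A single process-wide semaphore bounds how many linker instances are alive at once,
// regardless of which cache layer (bridge-level or linker-level) creates them.
object SharedLinkerLimit {
  private val maxLiveLinkers = 2 // assumed value, mirroring this PR's default
  private val slots = new Semaphore(maxLiveLinkers)

  /** Acquire a slot before instantiating a linker; invoke the returned
    * function when the cache evicts that linker again. */
  def acquire(): () => Unit = {
    slots.acquire()
    () => slots.release()
  }
}

// Usage sketch inside a cache:
//   val releaseSlot = SharedLinkerLimit.acquire()
//   val linker      = makeLinker(config)   // hypothetical factory
//   // ...on eviction: linker.clear(); releaseSlot()
```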
Correct. Previously it would start 8-10 in parallel but grind to a halt. The behaviour I observe now is that it is completing what it can complete and only getting stuck on the troublesome module, which is some unrelated issue.
    val res = Await.result(resultFuture, Duration.Inf)
    linker match {
      // drop the linker's internal state after each run to free memory
      case cl: ClearableLinker => cl.clear()
      case _ => // no-op
    }
    res
This change breaks Scala.js incremental linking.
I can revert it. The API docs don't mention that this is related to incremental linking.
I can revert it.
      }
    }

    private val linkerJobLimiter = ParallelismLimiter(linkerJobs)
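For readers outside this diff, a minimal sketch of what a semaphore-backed limiter in this spirit could look like; the actual `ParallelismLimiter` in Mill may be implemented differently:

```scala
import java.util.concurrent.Semaphore

// Sketch: bound how many bodies run concurrently.
final class ParallelismLimiterSketch(maxParallel: Int) {
  private val semaphore = new Semaphore(maxParallel)

  def withJob[T](body: => T): T = {
    semaphore.acquire()
    try body
    finally semaphore.release()
  }
}

// Usage, roughly mirroring the diff above (method names are assumptions):
//   private val linkerJobLimiter = new ParallelismLimiterSketch(linkerJobs)
//   linkerJobLimiter.withJob { runLinkJob() }
```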
I think you should pass linkerJobs instead of jobs to:

    val bridge = cl
      .loadClass("mill.scalajslib.worker.ScalaJSWorkerImpl")
      .getDeclaredConstructor(classOf[Int])
      .newInstance(jobs)
      .asInstanceOf[workerApi.ScalaJSWorkerApi]

Since we are running two linker jobs at a time to save memory, if we store 8 different ones in memory, we aren't saving as much memory as we want.
My working hypothesis was that the high memory usage is required while the linking is in progress, but most of it gets freed afterwards. That means that by delaying/synchronizing linking jobs, we already reduce the memory pressure. #6260 (comment) seems to support, or at least not contradict, this hypothesis.
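To put rough, purely illustrative numbers on that hypothesis (nothing here was measured): if each in-flight link peaks at around 2 GB, eight modules linking concurrently need on the order of 16 GB at once, while a limit of 2 caps the transient peak at roughly 4 GB plus whatever the retained linker caches hold. That retained-cache part is what the next comment points at.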
Consider that the test was performed with the code that clears the linker after every link step, which makes sense for freeing the linker memory afterwards. If we keep the linkers in memory and do not clear them, the test could give different results.
Fix: #6226
I just hardcoded the limit to `2` for now.
Pull request: #6260