Commit f077d49

feat(engine): Improve execution stalls troubleshooting, align dev and prod behavior, adding heartbeats.yield utility (#2489)
* feat(engine): Improve execution stalls troubleshooting, align dev and prod behavior, adding heartbeats.yield utility
* A few improvements via the 🐇 review
* Allow treating EXECUTION stalls as OOM errors, improve the error message, add more information to the docs, improve resource monitor and add it to the docs
* Add changeset
1 parent 5db583b commit f077d49

28 files changed: +1733 -238 lines changed

.changeset/soft-candles-grow.md

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+---
+"@trigger.dev/sdk": patch
+"trigger.dev": patch
+---
+
+Added the heartbeats.yield utility to allow tasks that do continuous CPU-heavy work to heartbeat and continue running
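
The changeset describes `heartbeats.yield` only in one sentence. The idea can be illustrated with a minimal sketch, assuming the heartbeat is sent from a timer on the same event loop as the task code, so a loop that never awaits would keep that timer from firing. `yieldToEventLoop` and `sumWithYields` are hypothetical names used for illustration; this is not the SDK's implementation.

```typescript
// Hypothetical stand-in for a yield utility: resolves on a later
// event-loop turn so that pending timers get a chance to run.
function yieldToEventLoop(): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, 0));
}

// CPU-heavy loop that pauses every `yieldEvery` items instead of
// holding the event loop for the whole computation.
async function sumWithYields(values: number[], yieldEvery = 1000): Promise<number> {
  let total = 0;
  let processed = 0;
  for (const value of values) {
    total += value;
    processed += 1;
    if (processed % yieldEvery === 0) {
      await yieldToEventLoop();
    }
  }
  return total;
}
```

In a real task the call inside the loop would presumably be the SDK utility named in the changeset rather than this stand-in.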

apps/webapp/app/env.server.ts

Lines changed: 3 additions & 2 deletions
@@ -519,8 +519,8 @@ const EnvironmentSchema = z
  RUN_ENGINE_WORKER_IMMEDIATE_POLL_INTERVAL: z.coerce.number().int().default(100),
  RUN_ENGINE_TIMEOUT_PENDING_EXECUTING: z.coerce.number().int().default(60_000),
  RUN_ENGINE_TIMEOUT_PENDING_CANCEL: z.coerce.number().int().default(60_000),
- RUN_ENGINE_TIMEOUT_EXECUTING: z.coerce.number().int().default(60_000),
- RUN_ENGINE_TIMEOUT_EXECUTING_WITH_WAITPOINTS: z.coerce.number().int().default(60_000),
+ RUN_ENGINE_TIMEOUT_EXECUTING: z.coerce.number().int().default(300_000), // 5 minutes
+ RUN_ENGINE_TIMEOUT_EXECUTING_WITH_WAITPOINTS: z.coerce.number().int().default(300_000), // 5 minutes
  RUN_ENGINE_TIMEOUT_SUSPENDED: z.coerce
  .number()
  .int()
@@ -735,6 +735,7 @@ const EnvironmentSchema = z
  RUN_ENGINE_RUN_QUEUE_LOG_LEVEL: z
  .enum(["log", "error", "warn", "info", "debug"])
  .default("info"),
+ RUN_ENGINE_TREAT_PRODUCTION_EXECUTION_STALLS_AS_OOM: z.string().default("0"),

  /** How long should the presence ttl last */
  DEV_PRESENCE_SSE_TIMEOUT: z.coerce.number().int().default(30_000),

apps/webapp/app/v3/runEngine.server.ts

Lines changed: 2 additions & 0 deletions
@@ -15,6 +15,8 @@ function createRunEngine() {
  prisma,
  readOnlyPrisma: $replica,
  logLevel: env.RUN_ENGINE_WORKER_LOG_LEVEL,
+ treatProductionExecutionStallsAsOOM:
+ env.RUN_ENGINE_TREAT_PRODUCTION_EXECUTION_STALLS_AS_OOM === "1",
  worker: {
  disabled: env.RUN_ENGINE_WORKER_ENABLED === "0",
  workers: env.RUN_ENGINE_WORKER_COUNT,
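
The two hunks above combine into one behavior: the executing timeout now defaults to 300_000 ms, and the OOM flag is a string that counts as enabled only when it is exactly "1". A dependency-free sketch of that mapping follows. It approximates the zod schema rather than reproducing it, and `EnvSource`, `readInt`, and `readStallOptions` are hypothetical names that do not appear in the commit.

```typescript
type EnvSource = Record<string, string | undefined>;

interface StallOptions {
  executingTimeoutMs: number;
  treatProductionExecutionStallsAsOOM: boolean;
}

// Rough analogue of z.coerce.number().int().default(fallback):
// an unset variable yields the default, anything else must be an integer.
function readInt(env: EnvSource, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined) return fallback;
  const parsed = Number(raw);
  if (!Number.isInteger(parsed)) {
    throw new Error(`${name} must be an integer, got "${raw}"`);
  }
  return parsed;
}

function readStallOptions(env: EnvSource): StallOptions {
  return {
    // New default from the diff: 300_000 ms (5 minutes), previously 60_000.
    executingTimeoutMs: readInt(env, "RUN_ENGINE_TIMEOUT_EXECUTING", 300_000),
    // Same comparison as runEngine.server.ts: only the exact string "1" enables it.
    treatProductionExecutionStallsAsOOM:
      (env["RUN_ENGINE_TREAT_PRODUCTION_EXECUTION_STALLS_AS_OOM"] ?? "0") === "1",
  };
}
```

Because the comparison is against the literal "1", a value such as "true" leaves the flag disabled.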