feature-request: trivial coordinator heartbeat router #58

@maxgruber19

I'd like to have a very simple router that just sends an HTTP GET to the Trino coordinator to learn its state.

We currently use a custom Python script router that requests https://trino-coordinator-default/v1/info and checks that the "starting" field is false. This gives us a very basic "queuing procedure", but clients die after a couple of seconds when they get no feedback from the load balancer, because it is stuck in its routing loop. I'll attach a basic example below. Of course, this approach limits routing to a single cluster instead of choosing among multiple clusters dynamically.

The behavior I'd like to propose is that trino-lb sends back "QUEUED_IN_TRINO_LB" for as long as it is waiting for the coordinator to come back up. Unfortunately I don't know Rust, so I don't feel ready to propose code myself.
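To make the proposal a bit more concrete, here is a rough sketch of the kind of response body I have in mind. It only assumes the usual shape of Trino's client protocol (an `id`, a `nextUri` the client keeps polling, and a `stats.state`). The function name `queued_response`, the `lb_base_url` parameter and the `nextUri` path are made up for illustration and are not taken from trino-lb; a real client may also expect more fields than shown here.

```python
def queued_response(query_id: str, lb_base_url: str, sequence: int) -> dict:
    """Build a minimal 'still queued' response for a polling Trino client.

    Hypothetical sketch: the nextUri path layout is an assumption,
    not trino-lb's actual route.
    """
    return {
        "id": query_id,
        # The client polls this URI again; the load balancer can keep
        # answering with the same state until the coordinator is ready.
        "nextUri": f"{lb_base_url}/v1/statement/queued/{query_id}/{sequence}",
        "stats": {"state": "QUEUED_IN_TRINO_LB"},
    }


if __name__ == "__main__":
    print(queued_response("example_query_id", "https://trino-lb.example", 1))
```

The point is that the client gets an answer right away and keeps polling, instead of waiting on an open connection while the router loops.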

If something like this already exists, I'd be very curious to hear about it.

import time
from typing import Optional
import requests


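def isCoordinatorReady():
  # Ask the coordinator for its state; a connection error or timeout counts as "not ready".
  try:
    response = requests.get(
      "https://trino-coordinator-default.mesh-platform-core.svc.cluster.local:8443/v1/info",
      verify="/etc/secret-provisioner-tls/ca.crt",
      timeout=5
    )
  except requests.RequestException:
    return False

  # Ready only when the coordinator answers 200 and has finished starting.
  return response.status_code == 200 and not response.json()['starting']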


def targetClusterGroup(query: str, headers: dict[str, str]) -> Optional[str]:
  while not isCoordinatorReady():
    time.sleep(10)
  return "my-single-cluster"
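As a stopgap, the blocking loop above could be given a deadline so the router gives up before clients time out. This is only a sketch: `wait_for_coordinator`, `route_with_deadline` and all parameter names are hypothetical, the readiness check is passed in as a callable, and I'm assuming that returning `None` lets trino-lb fall through to whatever it does when a router makes no decision.

```python
import time
from typing import Callable, Optional


def wait_for_coordinator(
    is_ready: Callable[[], bool],
    max_wait_seconds: float = 30.0,
    poll_interval_seconds: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll is_ready() until it returns True or max_wait_seconds has passed."""
    deadline = clock() + max_wait_seconds
    while True:
        if is_ready():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(poll_interval_seconds, remaining))


def route_with_deadline(is_ready: Callable[[], bool], **wait_options) -> Optional[str]:
    """Return the cluster group if the coordinator came up in time, else None."""
    if wait_for_coordinator(is_ready, **wait_options):
        return "my-single-cluster"
    return None
```

In the router script, `targetClusterGroup` would then just return `route_with_deadline(isCoordinatorReady)`. This does not solve the real problem (the client still gets no feedback while waiting), it only bounds how long the routing loop can hang.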
