Sometimes it can take a long time to pull container images. Try re-running the installer. Use kubectl get deployments --all-namespaces to identify failing deployments.
Try re-running the installer. Use kubectl get deployments --all-namespaces to identify failing deployments.
export DOCKER_HOST="unix://$HOME/.colima/docker.sock"On macOS, if Privacy and Content Restrictions are enabled (under
System Settings → Screen Time → Content & Privacy Restrictions),
then after the Kagenti installation completes, opening the UI may display a blank loading page.
To fix, disable these restrictions and restart the UI.
The install script expects docker to be in your runtime path.
A few problem fixes might include:
-
create
/usr/local/bin/dockerlink to podman:sudo ln -s /opt/podman/bin/podman /usr/local/bin/docker -
install
docker-credential-helper:brew install docker-credential-helper -
fix an issue with
insufficient memory to start keycloak:podman machine stop podman machine set --memory=12288 --cpus=8 podman machine start
-
clean, fresh Podman start:
podman machine rm -f podman machine init podman machine set --memory=12288 --cpus=8 podman machine start
-
clean the cluster, keep the Podman VM as is:
kind delete cluster --name agent-platform
If you see Init:ErrImagePull or Init:ImagePullBackOff errors while deploying components,
most likely your Github token expired.
Error text:
failed to authorize: failed to fetch oauth token: unexpected status from GET request to https://ghcr.io/token?scope=repository%3Akagenti%2Fkagenti-client-registration%3Apull&service=ghcr.io: 403 ForbiddenCheck your personal access token (classic).
Make sure to grant scopes all:repo, write:packages, and read:packages.
You may also get "ghcr.io: 403 Forbidden" errors installing Helm charts during Kagenti installation. You may have cached credentials that are no longer valid. The fix is docker logout ghcr.io.
It may happens with Keycloak or even the UI.
Restart the following:
kubectl rollout restart daemonset -n istio-system ztunnel
kubectl rollout restart -n kagenti-system deployment http-istioIf you need to update the values in charts/kagenti/.secrets.yaml file, e.g., githubToken,
delete the secret in all your auto-created namespaces, then re-run the installer:
kubectl get secret --all-namespaces
kubectl -n my-namespace delete github-token-secret
scripts/kind/setup-kagenti.shKagenti UI shows Connection errors:
An unexpected error occurred during A2A chat streaming: HTTP Error 503: Network communication error: peer closed connection without sending complete message body (incomplete chunked read)Agent log shows errors:
kagenti$ kubectl -n teams logs -f weather-service-7f984f478d-4jzv9
.
.
ERROR: Exception in ASGI application
+ Exception Group Traceback (most recent call last):
| File "/app/.venv/lib/python3.11/site-packages/uvicorn/protocols/http/h11_impl.py", line 403, in run_asgi
| result = await app( # type: ignore[func-returns-value]
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| ..
+-+---------------- 1 ----------------
| urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))Most likely the A2A protocol is failing because ollama service is not installed or running.
Start ollama service in the terminal and keep it running:
ollama serveThen try the prompt again.
Keycloak stops working and logs show connection errors.
At this time there is no reliable sequence of bringing down and up again postgres and keycloak. The only reliable approach found so far is either to destroy and re-install the cluster or delete and re-install keycloak as follows:
# Delete and re-apply keycloak resources
helm uninstall keycloak -n keycloak
scripts/kind/setup-kagenti.sh
# Restart related services
kubectl rollout restart daemonset -n istio-system ztunnel
kubectl rollout restart -n kagenti-system deployment http-istio
kubectl rollout restart -n kagenti-system deployment kagenti-uiDeployed agents may need to be restarted to update the Keycloak client.
kubectl rollout restart -n <agent-namespace> deployment <agent-deployment e.g. weather-service>When running the kagenti helm chart upgrade, you may encounter an error stating failed calling webhook "webhook.cert-manager.io" because the x509 certificate has expired. This occurs when the internal certificates used by cert-manager to communicate with the Kubernetes API server are no longer valid, preventing the validation of resources like Certificates and Issuers.
To resolve this, you must force cert-manager to regenerate its internal CA and certificates by following these steps:
Delete the expired webhook secret:
kubectl delete secret cert-manager-webhook-ca -n cert-managerRestart the cert-manager deployments to trigger the issuance of new certificates:
kubectl rollout restart deployment cert-manager -n cert-manager
kubectl rollout restart deployment cert-manager-webhook -n cert-manager
kubectl rollout restart deployment cert-manager-cainjector -n cert-managerVerify the pods are healthy before retrying your Helm command:
kubectl wait --for=condition=ready pod -l app.kubernetes.io/instance=cert-manager -n cert-manager --timeout=60sOnce the pods are back in a Running state with valid certificates, your helm upgrade --install
command should complete successfully.