Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Istio Fork for Fixing Multicluster Registry Bug #1

Draft
wants to merge 7 commits into
base: release-1.0
Choose a base branch
from

Conversation

catherinetcai
Copy link

No description provided.

FAIR.md Outdated
@@ -0,0 +1,32 @@
# Why this Fork
In a mutual TLS-disabled multicluster Kubernetes setup, gRPC liveness and readiness probes don't work in remote clusters. However, this behavior could not be replicated on the Kubernetes cluster running the Istio control plane.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its not grpc specific

FAIR.md Outdated

After diving into the Istio source, we only found that 3333 and 9999 were ports returned from the [Envoy debug registry](https://github.com/istio/istio/blob/1.0.2/pilot/pkg/proxy/envoy/v2/debug.go#L324-L337).

In the Istio logs, we could see that a remote cluster registry was being picked up. However, based off of the [server bootup order]*(https://github.com/istio/istio/blob/1.0.2/pilot/pkg/bootstrap/server.go#L211-L219) (note that Discovery service is always init'd before Multicluster), it looked like the debug registry would always be loaded before the multicluster ones. Since the debug registry would always return entries for the management ports, it meant that the debug management ports would always be used instead of the remote cluster ones, meaning that liveness and readiness probes for remote clusters would NEVER work.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've got a * which is breaking the link [server bootup order]*(https://...) should be [server bootup order](https://...)

Copy link

@marshallbrekka marshallbrekka Mar 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would elaborate a tiny bit to explain that their are various providers that the aggregate controller uses to get ManagementPorts, and that it iterates over them sequently, taking the first one that returns a non-nil result.
And then further highlight that the MultiCluster one is appended AFTER the debug one, which means that the remote one will never be called since debug always returns a value.
You kind of already said that, but just being explicit about how those providers are iterated over i think makes it clearer.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoa this is really weird. I don't see that in my text editor (the *). Good catch.

@catherinetcai catherinetcai changed the title Task/cat/add more logging Istio Fork for Fixing Multicluster Registry Bug Mar 13, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants