Description
What happened?
In our k8s cluster, processes or nodes are sometimes killed by other systems running in the cluster. When this happens, the node can disappear from the cluster. Sometimes this seems to result in the process staying in the FDBClusterStatus, despite no longer being part of the cluster.
We see log lines like this from the updateStatus reconciler:
skip updating fault domain for process group with missing process in FoundationDB cluster status
The processGroupID in these lines is one that no longer exists in the cluster. This then causes problems in certain operations, such as updating pods, because some of the reconcilers iterate over the processes in the FDBClusterStatus and try to fetch their details from k8s, but then cannot find the pod.
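For reference, this is roughly how we check for the stale entries (the cluster name is a placeholder, and the jsonpath assumes the v1beta2 status layout with a processGroups list):
# List the process group IDs the operator still tracks in the status:
$ kubectl get foundationdbcluster sample-cluster -o jsonpath='{.status.processGroups[*].processGroupID}'
In our case the output still includes IDs for processes that the machine-readable status no longer reports.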
We encountered this on operator version 2.3.0.
What did you expect to happen?
I would expect processes that are no longer reported in the machine-readable status to be removed from the FDBClusterStatus.
How can we reproduce it (as minimally and precisely as possible)?
I'm not entirely sure, since I don't have an exact reproduction, but I believe you can trigger it by deleting a pod or node from k8s while the cluster is running (see the sketch below).
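A rough sketch of the steps we'd try (the pod, node, and cluster names are placeholders):
# Forcibly remove a pod out from under the operator:
$ kubectl delete pod sample-cluster-storage-1 --grace-period=0 --force
# Or remove the node the pod was scheduled on:
$ kubectl delete node <node-name>
# Then check whether the process group lingers in the FDBClusterStatus,
# using the same jsonpath query as above:
$ kubectl get foundationdbcluster sample-cluster -o jsonpath='{.status.processGroups[*].processGroupID}'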
Anything else we need to know?
No response
FDB Kubernetes operator
FDB version: 7.1.67
Operator version: v2.3.0
Kubernetes version
$ kubectl version
Client Version: v1.32.1
Kustomize Version: v5.5.0
Server Version: v1.31.601