fix(framework): Report clientapp appio communication failures as errors (#7061)
Conversation
Pull request overview
This PR fixes a corner case in the SuperNode ClientApp runtime where AppIO gRPC failures could still be reported as success (`ExitCode.SUCCESS`), and introduces a dedicated non-zero exit code for such communication failures.
Changes:
- Update `run_clientapp` to return a non-zero exit code when a `grpc.RpcError` occurs during AppIO communication.
- Add `ExitCode.CLIENTAPP_COMMUNICATION_ERROR = 250` and document it.
- Add a unit test ensuring gRPC failures don't report success.
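The control flow of the fix can be sketched roughly as follows. This is a minimal, self-contained illustration, not the actual Flower code: the `pull_message`/`push_reply` callables are stand-ins for the AppIO gRPC calls, and a local `RpcError` class substitutes for `grpc.RpcError` so the sketch runs without `grpcio` installed.

```python
class RpcError(Exception):
    """Stand-in for grpc.RpcError (keeps the sketch dependency-free)."""


# Hypothetical constants mirroring flwr.common.exit.exit_code
SUCCESS = 0
CLIENTAPP_COMMUNICATION_ERROR = 250  # value introduced by this PR


def run_clientapp(pull_message, push_reply) -> int:
    """Run one ClientApp round and return a process exit code.

    Before this fix, an RPC error raised while talking to the AppIO
    API could fall through and still be reported as SUCCESS.
    """
    exit_code = SUCCESS
    try:
        message = pull_message()      # AppIO gRPC call (stand-in)
        push_reply(f"handled:{message}")  # AppIO gRPC call (stand-in)
    except RpcError:
        # Map communication failures to the dedicated non-zero code
        exit_code = CLIENTAPP_COMMUNICATION_ERROR
    return exit_code
```

The key point is that the exit code is tracked in a local variable and switched inside the `except` block, so a communication failure can no longer fall through to the success path.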
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `framework/py/flwr/supernode/runtime/run_clientapp.py` | Tracks an `exit_code` and switches it to a ClientApp communication error on `grpc.RpcError`. |
| `framework/py/flwr/supernode/runtime/run_clientapp_test.py` | Adds a test asserting `flwr_exit` is called with the new non-zero exit code on gRPC failure. |
| `framework/py/flwr/common/exit/exit_code.py` | Introduces the new ClientApp-specific exit code and help text; adjusts the documented ServerApp range. |
| `framework/docs/source/ref-exit-codes/250.rst` | Documents exit code 250 and remediation guidance. |
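The test added in `run_clientapp_test.py` follows a common pattern: mock the exit helper, simulate a gRPC failure, and assert the helper received the non-zero code. A hedged sketch of that pattern (the function names and the injected `exit_fn` parameter are illustrative, not the real test's API; a local `RpcError` again stands in for `grpc.RpcError`):

```python
from unittest import mock


class RpcError(Exception):
    """Stand-in for grpc.RpcError."""


CLIENTAPP_COMMUNICATION_ERROR = 250


def run_clientapp(pull_message, exit_fn) -> None:
    """Toy runtime loop: calls exit_fn(250) on a communication failure."""
    try:
        pull_message()
        exit_fn(0)
    except RpcError:
        exit_fn(CLIENTAPP_COMMUNICATION_ERROR)


def failing_pull():
    raise RpcError("AppIO channel unavailable")


# The test asserts the exit helper is called with the non-zero code
exit_mock = mock.Mock()
run_clientapp(failing_pull, exit_mock)
exit_mock.assert_called_once_with(CLIENTAPP_COMMUNICATION_ERROR)
```

Using a mock here guards against the regression this PR fixes: if the error were swallowed, the mock would record a call with `0` and the assertion would fail.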
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
The implementation looks good to me, but I think the new exit-code docs are a little too user-action-oriented for an internal process. In normal runtime usage, users do not start `flwr-clientapp` directly; it is launched by SuperExec with the AppIO address/token/TLS settings already derived from the SuperNode/SuperExec setup.
Could we adjust this page to say that this means the internal ClientApp process (`flwr-clientapp`) could not communicate with the SuperNode ClientAppIo API, and that users should check the surrounding SuperNode/ClientApp logs for the underlying gRPC error? A likely cause is that the ClientApp process took too long to start, for example on a very slow or overloaded system, causing the short-lived token/heartbeat window to expire. If the log message is not enough to diagnose the issue, users should contact the Flower team with the relevant logs. Wdyt?
Updated! LMK if you want further changes!
Minor fix for an error-checking corner case that could cause silent failures (success reported despite an error).
This addresses the Codex comment remaining in the already-merged PR 6986.