Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Certify collaborator step (in case of TLS) internally registers it, need to separate them out #1412

Open
noopurintel opened this issue Feb 28, 2025 · 5 comments
Assignees

Comments

@noopurintel
Copy link
Collaborator

noopurintel commented Feb 28, 2025

Describe the bug
Right now, in case of TLS, fx collaborator certify step internally registers collaborators.
This leads to confusion while running a non-TLS experiment where we need to perform collaborator registration separately as it isn't mentioned anywhere.

To Reproduce
Steps to reproduce the behavior:

  1. Refer to https://openfl.readthedocs.io/en/latest/tutorials/taskrunner.html
  2. Git clone openfl and install all the pre-requisites.
  3. Create workspace using fx workspace create --template torch_cnn_mnist --prefix my_workspace
  4. Move inside my_workspace folder
  5. Modify the plan/plan.yaml and assign use_tls: false (default value is true)
  6. Initialise the plan.
  7. Skip the certify steps at aggregator and collaborators level.
  8. Create 2 collaborators - keep the names different from usual collaborator1/2 to easily identify the problem.
  9. Run fx collaborator create -n a1 -d 1 and fx collaborator create -n a2 -d 2
  10. At this point, plan/data.yaml gets updated with two entries a1,1 and a2,2
  11. Start the collaborators and aggregator.
  12. Collaborators a1 and a2 will start and keep waiting for the aggregator whereas aggregator will result in below error.
    ValueError: Expected sequence membership, but a2 is not in ['collaborator1', 'collaborator2']

Expected behavior

  1. There should be a step to add/register collaborators before we create them irrespective of TLS/non TLS scenarios.
  2. With above step in place, the experiment should run smoothly without any error.
@noopurintel noopurintel changed the title Collaborator certify step internally registers collaborators, need to separate them out Certify collaborator step internally registers it, need to separate them out Feb 28, 2025
@noopurintel noopurintel changed the title Certify collaborator step internally registers it, need to separate them out Certify collaborator step (in case of TLS) internally registers it, need to separate them out Feb 28, 2025
@teoparvanov
Copy link
Collaborator

Hi @noopurintel , if I understand correctly, the concern is that you need to manually add the collaborator1 and collaborator2 entries to cols.yaml when bootstrapping a non-TLS TaskRunner experiment?

@noopurintel
Copy link
Collaborator Author

Hi @noopurintel , if I understand correctly, the concern is that you need to manually add the collaborator1 and collaborator2 entries to cols.yaml when bootstrapping a non-TLS TaskRunner experiment?

You are right. Here, the expectation is to have "certify_collaborator" and "register_collaborator" as two separate functions.
TLS will call both whereas non TLS will call the latter.

Importantly, there is no documentation or atleast TO-DOs available for non TLS scenario.

@teoparvanov
Copy link
Collaborator

teoparvanov commented Mar 6, 2025

It's true that there isn't official documentation for setting up a federation without TLS. I wonder though, what would be the use case for a non-TLS federation, except local testing? In FL experiments that deal with any kind of sensitive data it is paramount to protect the communication between the aggregator and collaborators, because leaking intermediate model updates can reveal more about the private data than the final model itself.

As a workaround, we may consider adding a "non-TLS" flavor of the quickstart guide that clarifies the additional required step of populating cols.yaml. WDYT?

@noopurintel
Copy link
Collaborator Author

It's true that there isn't official documentation for setting up a federation without TLS. I wonder though, what would be the use case for a non-TLS federation, except local testing? In FL experiments that deal with any kind of sensitive data it is paramount to protect the communication between the aggregator and collaborators, because leaking intermediate model updates can reveal more about the private data than the final model itself.

As a workaround, we may consider adding a "non-TLS" flavor of the quickstart guide that clarifies the additional required step of populating cols.yaml. WDYT?

Yes, that would do the needful without having to change the code.

@noopurintel
Copy link
Collaborator Author

Raised this PR #1442 to document the same.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants