
Hi, I can't see local representation learning in main_lg.py. It just records the best local acc and the best local model, then updates the global model. #6

Open
wardseptember opened this issue Dec 18, 2020 · 6 comments

Comments

@wardseptember

Hi, I can't see local representation learning in main_lg.py. It just records the best local acc and the best local model, then updates the global model. Did I understand it wrong? What does main_mtl.py do? I'm confused by it; looking forward to your reply, thank you.

@wardseptember
Author

I think this code can't reduce the number of parameters. The code does not implement the ideas in your paper. Is there an error in my understanding?

@terranceliu
Collaborator

Hi, just to clarify, the algorithm is designed to reduce the number of parameters communicated each round, and then we evaluate the local and global models on various test sets (local and new). I'm not entirely sure about what you are trying to achieve, but for our experiments, each local model will end up with different parameters (since only some layers are receiving global updates) and thus learn different local representations.

main_mtl.py contains an implementation of a separate method (federated multi-task learning) and is unrelated to our method.
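To make the "only some layers receive global updates" point concrete, here is a minimal sketch (not the repo's code): the ToyNet, its layer split, and the average_keys helper are made-up assumptions for illustration, but the idea is the same as in main_lg.py — only the parameter names in w_glob_keys are uploaded, averaged, and broadcast, so the representation layers never leave the client.

import copy
import torch
import torch.nn as nn

class ToyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.rep = nn.Linear(10, 8)   # "local representation" layer, never communicated
        self.head = nn.Linear(8, 2)   # layer whose parameters are shared globally

def average_keys(local_weights, keys):
    # plain FedAvg, but only over the parameter names in `keys`
    avg = {k: torch.zeros_like(local_weights[0][k]) for k in keys}
    for w in local_weights:
        for k in keys:
            avg[k] += w[k] / len(local_weights)
    return avg

clients = [ToyNet() for _ in range(3)]
w_glob_keys = ["head.weight", "head.bias"]  # only these names are communicated

# One communication round: upload only the global keys, average, broadcast back.
uploads = [{k: c.state_dict()[k] for k in w_glob_keys} for c in clients]
w_glob = average_keys(uploads, w_glob_keys)
for c in clients:
    sd = c.state_dict()
    sd.update(copy.deepcopy(w_glob))  # rep.* stays untouched on each client
    c.load_state_dict(sd)

Because each client keeps its own rep.* parameters while sharing only the head, the clients end up with different local representations, which is what the local/new test-set evaluation above compares.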

@wardseptember
Author

wardseptember commented Dec 19, 2020

Thank you for your reply. I understand it now; indeed, only some layers receive global updates:

w_glob_keys = net_glob.weight_keys[total_num_layers - args.num_layers_keep:]
w_glob_keys = list(itertools.chain.from_iterable(w_glob_keys))
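To illustrate what these two lines select, here is a toy, self-contained example (the nested weight_keys list and layer names below are made up; in the repo, net_glob.weight_keys holds the real per-layer parameter names and args.num_layers_keep comes from the command line):

import itertools

weight_keys = [                      # one sub-list of parameter names per layer
    ["conv1.weight", "conv1.bias"],
    ["conv2.weight", "conv2.bias"],
    ["fc1.weight", "fc1.bias"],
    ["fc2.weight", "fc2.bias"],
]
total_num_layers = len(weight_keys)
num_layers_keep = 2                  # stands in for args.num_layers_keep

# keep only the last `num_layers_keep` layers as the globally shared keys
w_glob_keys = weight_keys[total_num_layers - num_layers_keep:]
w_glob_keys = list(itertools.chain.from_iterable(w_glob_keys))
print(w_glob_keys)  # ['fc1.weight', 'fc1.bias', 'fc2.weight', 'fc2.bias']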

I didn't look at the code carefully before. Thank you again.

@TsingZ0

TsingZ0 commented Jun 13, 2021

I think this code can't reduce the number of parameters. The code does not implement the ideas in your paper. Is there an error in my understanding?

@wardseptember After carefully checking the code and the reply above, I still have the same doubt. Have you figured it out yet?

@wardseptember
Author

@TsingZ0 Take a look at the two lines of code above: only the first few layers are updated, not the parameters of every layer, so the number of exchanged parameters is reduced.
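As a rough, self-contained sketch of that saving (the toy model and the choice of which keys count as global below are illustrative assumptions, not the repo's configuration):

import torch.nn as nn

net = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
w_glob_keys = ["2.weight", "2.bias"]  # pretend only the last Linear layer is shared

sd = net.state_dict()
sent = sum(sd[k].numel() for k in w_glob_keys)
total = sum(p.numel() for p in net.parameters())
print(f"communicated per round: {sent} / {total} parameters")
# ~2.6k of ~203k parameters leave the client each round in this toy setup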

@jarumil

jarumil commented Sep 10, 2021

I still have the same doubt as @wardseptember. In your algorithm you define the procedure with two different models, the local model (local representations) and the global model (for classification), which are trained independently on each client. However, in your code you simply train the complete model and then separate the layers corresponding to the global model and the local model. Which of the two ways is really the correct one? Best regards.
