Support multiple host names for FLARE server #3018

yanchengnv · 2024-10-09T19:39:03Z

Fixes # .

Description

Currently, the FLARE Server's host name can only be specified with the "name" attribute of the server in project.yml. When the server cert is generated, this name is also used as the Common Name (CN) of the cert. There are several problems with this:

The max length for CN is 63 chars. If the host name is longer than that, a cert cannot be generated.
Multiple host names may be needed for the FLARE server (e.g. internal clients and external clients may need to use different host names).
IP addresses cannot be used as the host name since the CN does not allow that, but in some cases IP addresses are needed.

This PR solves all three issues:

Multiple host names (host_names) can be specified in the "server" element in project.yml. This is a list of host names or IP addresses. The CN will continue to be treated as a host name, for backward compatibility. When the server cert is generated, these host names are included in its Subject Alternative Name (SAN) extension.

When defined as "host_names", the max length for each value is 253 chars (much larger than 63 chars).
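
As a rough illustration of how the entries of host_names could be sorted into DNS-type and IP-type SAN entries, here is a minimal sketch that uses only the Python standard library. The helper name `classify_host_names` and the constant `MAX_HOST_NAME_LEN` are hypothetical and are not taken from the NVFlare code. Only the 253-char limit and the fact that the list may mix host names and IP addresses come from the description above.

```python
import ipaddress

# Limit stated in the PR description for each host_names value.
MAX_HOST_NAME_LEN = 253


def classify_host_names(host_names):
    """Split entries into (dns_names, ip_addresses) for a SAN extension.

    Hypothetical helper for illustration only; not the NVFlare implementation.
    """
    dns_names, ip_addresses = [], []
    for entry in host_names:
        if not entry or len(entry) > MAX_HOST_NAME_LEN:
            raise ValueError(f"invalid host name length: {entry!r}")
        try:
            # Valid IPv4/IPv6 literals would become IP-type SAN entries.
            ip_addresses.append(ipaddress.ip_address(entry))
        except ValueError:
            # Anything that is not an IP literal is treated as a DNS name.
            dns_names.append(entry)
    return dns_names, ip_addresses
```

The split matters because a SAN extension stores DNS names and IP addresses as different entry types, so an IP address written as a DNS name would generally not match during certificate verification.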

For client and admin, you can specify the "connect_to" attribute to select which host to use to connect to the FLARE server. The value of "connect_to" must be either the server's "name" or one of the values in the server's "host_names" list.
When startup config files are generated for clients and admin users, the sp_end_point value (for dummy agent) will use the "connect_to" value if specified.
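
The "connect_to" rule above can be sketched as a small check. This is an illustration under assumptions, not the code in this PR: the function name `resolve_connect_host` is hypothetical, participants are modeled as plain dicts, and falling back to the server's "name" when "connect_to" is absent is assumed from the backward-compatibility statement.

```python
def resolve_connect_host(server, participant):
    """Return the host a client or admin should use to reach the server.

    Hypothetical helper for illustration only; not the NVFlare implementation.
    """
    # Allowed targets: the server's "name" plus every "host_names" entry.
    allowed = {server["name"], *server.get("host_names", [])}

    connect_to = participant.get("connect_to")
    if connect_to is None:
        # Assumption: without "connect_to", keep using the server name.
        return server["name"]
    if connect_to not in allowed:
        raise ValueError(
            f"connect_to {connect_to!r} of {participant.get('name')!r} "
            "is neither the server name nor one of its host_names"
        )
    return connect_to


# Sample data with made-up names and addresses.
server = {"name": "flare-server", "host_names": ["flare.internal.example.com", "10.0.0.5"]}
internal_client = {"name": "site-1", "connect_to": "10.0.0.5"}
print(resolve_connect_host(server, internal_client))
```

Rejecting an unknown "connect_to" value at provision time is useful because a host that is absent from the server cert would otherwise only fail later, when the client tries to connect.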

NOTE: this PR implements multiple host name support only for CLI provisioning. The Dashboard needs to be updated later to make this available for Dashboard-based provisioning.

Types of changes

Non-breaking change (fix or new feature that would not break existing functionality).
Breaking change (fix or new feature that would cause existing functionality to change).
New tests added to cover the changes.
Quick tests passed locally by running ./runtest.sh.
In-line docstrings updated.
Documentation updated.

nvidianz

LGTM

nvidianz · 2024-10-10T15:33:37Z

/build

…are into support_multi_host_names

IsaacYangSLA · 2024-10-11T15:51:02Z

/build

IsaacYangSLA

Thanks for the PR. LGTM.

…are into support_multi_host_names

yanchengnv · 2024-10-11T20:01:03Z

/build

IsaacYangSLA · 2024-10-11T20:50:18Z

/build

IsaacYangSLA

Approve this again. LGTM.

IsaacYangSLA · 2024-10-11T20:58:35Z

/build

yanchengnv · 2024-10-11T21:02:15Z

/build

--------- Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <yuantingh@nvidia.com> * Add CSE with job api with client api (#2918) * change getting started examples to use BaseFedJob (#2919) * Added warning for mixed plugin use (#2920) * BugFix: Hierarchical Fed Stats, prepare data: replace os.rename() function (#2921) * replace os.rename() to shutil.move() to avoid os.error when destination and src are in different mnt or devices * remove redundant move * remove unused import * Note about Simulator in XGBoost Doc (#2911) * Added a note about simulator not supporting resources.json * Rephrased the sentence --------- Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <yuantingh@nvidia.com> * add params_transfer_type to ScriptRunner (#2922) * Fix nemo examples (#2923) * fix prompt_learning * fix peft * Added the current-round info the fl_ctx for BaseModelController (#2916) * Added the current-round info the fl_ctx for BaseModelController. * reformat. * codestyle fix. * Moved the self.set_fl_context(data) call to broadcast_model(). * Change broadcast_model() to must send a FLModel, not None. * Changed the BaseModelController broadcast_model data default value, and a warning message to debug. * refactoried. * Updated docstring. 
--------- Co-authored-by: Chester Chen <512707+chesterxgchen@users.noreply.github.com> Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <yuantingh@nvidia.com> * Fix ci path (#2927) * Fix path * Fix path * Remove invalid validator --------- Co-authored-by: Sean Yang <seany314@gmail.com> * Fix xgb standalone fed (#2924) Co-authored-by: Chester Chen <512707+chesterxgchen@users.noreply.github.com> * Fixing the memoryview issues (#2926) * Added handling for buffer overun * Added task_lock to read() and ignore duplicate chunks * Simplifed the wait loop * Fixed a formatting error * Check EOS when appending data --------- Co-authored-by: Chester Chen <512707+chesterxgchen@users.noreply.github.com> Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <yuantingh@nvidia.com> * Fixing memoryview error (#2929) * Fixed dup seq 0 bug * Formatting errors * update higgs data link (#2941) * update video links (#2937) * fix typo (#2939) * Add research examples to tutorial page (#2942) * add research examples to tutorial page * remove banner --------- Co-authored-by: Chester Chen <512707+chesterxgchen@users.noreply.github.com> * Fix doc and docstring issues (#2931) * Fix doc and docstring issues * Address comments --------- Co-authored-by: Sean Yang <seany314@gmail.com> Co-authored-by: Chester Chen <512707+chesterxgchen@users.noreply.github.com> * Add check for receive before send in client api (#2930) * Add flare series section, enhancements (#2948) * improvements, add series section * address comments * Bump tqdm from 4.66.1 to 4.66.3 in /research/condist-fl (#2557) Bumps [tqdm](https://github.com/tqdm/tqdm) from 4.66.1 to 4.66.3. - [Release notes](https://github.com/tqdm/tqdm/releases) - [Commits](https://github.com/tqdm/tqdm/compare/v4.66.1...v4.66.3) --- updated-dependencies: - dependency-name: tqdm dependency-type: direct:production ... 
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump micromatch from 4.0.5 to 4.0.8 in /web (#2838) Bumps [micromatch](https://github.com/micromatch/micromatch) from 4.0.5 to 4.0.8. - [Release notes](https://github.com/micromatch/micromatch/releases) - [Changelog](https://github.com/micromatch/micromatch/blob/4.0.8/CHANGELOG.md) - [Commits](https://github.com/micromatch/micromatch/compare/4.0.5...4.0.8) --- updated-dependencies: - dependency-name: micromatch dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump dset from 3.1.3 to 3.1.4 in /web (#2936) Bumps [dset](https://github.com/lukeed/dset) from 3.1.3 to 3.1.4. - [Release notes](https://github.com/lukeed/dset/releases) - [Commits](https://github.com/lukeed/dset/compare/v3.1.3...v3.1.4) --- updated-dependencies: - dependency-name: dset dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Add support to newer python version (#2951) * Upgrade formatter version for support higher version of Python (#2957) * Upgrade formatter version for support higher version of Python * Fix formatting issues * Fix unit tests * disable not working tests * add research links redirects (#2953) Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <yuantingh@nvidia.com> * Fix a bug in dashboard that server local resource file was not generated (#2964) correctly Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <yuantingh@nvidia.com> * improved fobs register_folder to catch ValueError. 
(#2958) * Add fedrag example with embedding training (#2915) * Add fedrag example with embedding training * fix link and format * fix link and format * fix link and format * keep rag folder structure, remove the retrieveal placeholder * keep rag folder structure, remove the retrieveal placeholder * remove template job preparation * remove template job preparation * update JobAPI script * update eval bash * update eval bash and result --------- Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <yuantingh@nvidia.com> * Bump rollup from 3.29.4 to 3.29.5 in /web (#2963) Bumps [rollup](https://github.com/rollup/rollup) from 3.29.4 to 3.29.5. - [Release notes](https://github.com/rollup/rollup/releases) - [Changelog](https://github.com/rollup/rollup/blob/master/CHANGELOG.md) - [Commits](https://github.com/rollup/rollup/compare/v3.29.4...v3.29.5) --- updated-dependencies: - dependency-name: rollup dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Bump path-to-regexp from 6.2.2 to 6.3.0 in /web (#2938) Bumps [path-to-regexp](https://github.com/pillarjs/path-to-regexp) from 6.2.2 to 6.3.0. - [Release notes](https://github.com/pillarjs/path-to-regexp/releases) - [Changelog](https://github.com/pillarjs/path-to-regexp/blob/master/History.md) - [Commits](https://github.com/pillarjs/path-to-regexp/compare/v6.2.2...v6.3.0) --- updated-dependencies: - dependency-name: path-to-regexp dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sean Yang <seany314@gmail.com> Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <yuantingh@nvidia.com> * Bump vite from 4.5.3 to 4.5.5 in /web (#2950) Bumps [vite](https://github.com/vitejs/vite/tree/HEAD/packages/vite) from 4.5.3 to 4.5.5. 
- [Release notes](https://github.com/vitejs/vite/releases) - [Changelog](https://github.com/vitejs/vite/blob/v4.5.5/packages/vite/CHANGELOG.md) - [Commits](https://github.com/vitejs/vite/commits/v4.5.5/packages/vite) --- updated-dependencies: - dependency-name: vite dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Sean Yang <seany314@gmail.com> Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <yuantingh@nvidia.com> * Add the hello-pt-resnet example (#2954) * Add the hello-pt-resnet example. * Removed the no use SimpleNetwork. * codestyle fix for hello-pt-resnet example. * renamed the simple_network.py -> resnet_18.py. And the resnet18 link to ReadMe. * updated license year. * codestyle fix. * black codestyle fix. * codestyle fix. --------- Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <yuantingh@nvidia.com> * Update CONTRIBUTING.md (#2969) * update PSI to support python 3.11 (#2972) * update PSI requirements.txt to support openmind-psi==2.0.4 which support python 3.11 * add comments * add web versioning (#2974) Co-authored-by: Chester Chen <512707+chesterxgchen@users.noreply.github.com> * [Main] Support object reuse (#2975) * support object reuse * fix formatting --------- Co-authored-by: Sean Yang <seany314@gmail.com> Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <yuantingh@nvidia.com> * update openmind-psi to 2.0.5 for python 12 support (#2981) * Replace the distutils with shutil. 
(#2978) * Allow multiple workflows in CCWF (#2980) * support object reuse * fix formatting * allow multiple workflows in ccwf * allow multiple workflows in ccwf --------- Co-authored-by: Sean Yang <seany314@gmail.com> Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <yuantingh@nvidia.com> * F3 Streaming Code Rewrite (#2960) * Ported F3 streaming rewrite code to main * Moved code reference class variables to RXTask class * Rollback changes to byte_streamer.py --------- Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <yuantingh@nvidia.com> * Pass components into script runner (#2983) * Fix tf model persistor and tf model (#2984) * add missing filter id arg in tf model persistor * Update TFModel * Address comment * Allow customization of BaseFedJob (#2985) * Add CommonComponentsJob * Fix format * Address comments * Fix issue * add umami analytics (#2987) * Update pt params converter (#2989) * update pt params converter * use exclude_vars * print warning * add return value * Bionemo demos (#2968) * updated bionemo demos to v1.8 * cleaned demos outputs for clarity * added linces and fixed naming README * fixed license headers and readme hyperlink * black fixing code * isort and flake8 fixes * addressing PR changes * removed unrequired infer copy file * updated other runs files/configs and fixed path in downstream notebook * fixed fedavg max_epochs setting to 1, removed extra data in taps yamls, fixed column used for each site * changed fedavg* and local* yamls to have site specific data for tap. for sabdab fedavg changed to original ??? * also sabdab local changed dataset.train config to ??? 
* tap fix configurations * update nb, nvflare version * use strict false * use full data for central training of sabdab --------- Co-authored-by: Chester Chen <512707+chesterxgchen@users.noreply.github.com> Co-authored-by: Holger Roth <6304754+holgerroth@users.noreply.github.com> Co-authored-by: Holger Roth <hroth@nvidia.com> * Add FLARE DAY page (#2992) * add flare day page * add slides * move link location * add web speaker (#2999) * Fix doc typo and VDR reported issues (#2994) * BioNeMo: use multi threading but reduce num workers (#2996) * use multi threading but reduce num workers * revert nbs * update links * Update integration test script and upgrade tenseal and psi version (#2995) * Update documentation for Dockerfile, add location of tbevents, fix link (#2993) * update documentation for Dockerfile, add location of tbevents, and fix link * add comment for Dockerfile to explain difference * fix the entry for getting started in the TOC (#3007) * Expost init in client lightning api (#3004) * enhance web responsive design for mobile (#3010) * Redid the branch to cleanup the commits (#2986) * Update flwr job object, client, server (#3008) Co-authored-by: Sean Yang <seany314@gmail.com> * Bump cookie, @astrojs/mdx and astro in /web (#3002) Bumps [cookie](https://github.com/jshttp/cookie) to 0.7.2 and updates ancestor dependencies [cookie](https://github.com/jshttp/cookie), [@astrojs/mdx](https://github.com/withastro/astro/tree/HEAD/packages/integrations/mdx) and [astro](https://github.com/withastro/astro/tree/HEAD/packages/astro). These dependencies need to be updated together. 
Updates `cookie` from 0.5.0 to 0.7.2 - [Release notes](https://github.com/jshttp/cookie/releases) - [Commits](https://github.com/jshttp/cookie/compare/v0.5.0...v0.7.2) Updates `@astrojs/mdx` from 1.1.5 to 3.1.7 - [Release notes](https://github.com/withastro/astro/releases) - [Changelog](https://github.com/withastro/astro/blob/main/packages/integrations/mdx/CHANGELOG.md) - [Commits](https://github.com/withastro/astro/commits/@astrojs/mdx@3.1.7/packages/integrations/mdx) Updates `astro` from 3.6.5 to 4.15.12 - [Release notes](https://github.com/withastro/astro/releases) - [Changelog](https://github.com/withastro/astro/blob/main/packages/astro/CHANGELOG.md) - [Commits](https://github.com/withastro/astro/commits/astro@4.15.12/packages/astro) --- updated-dependencies: - dependency-name: cookie dependency-type: indirect - dependency-name: "@astrojs/mdx" dependency-type: direct:production - dependency-name: astro dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> * Add GNN encoder and xgb outputs for finance end-to-end example (#2970) * Readme notebook polish and cleanup * Reorganize folder structure and initial gnn * Complete the graph generate step with edgemap output * Format fix * Format fix * Add graph construction and training notebooks * Add full gnn functionality * Update wording for readme --------- Co-authored-by: Chester Chen <512707+chesterxgchen@users.noreply.github.com> * Fix fobs issue (#3011) * Fix fobs doc (#3012) * Remove the need to create additinal ports when running a job. 
(#3017) * Fixed broken doc ref to 'helm_chart' (#3022) * add none default values (#3025) * FedBPT: Fix fedbpt cma version (#3029) * fix cma version; upgrade nvflare version * upgrade python to 3.12 * upgrade openmined-psi version (#3020) Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <yuantingh@nvidia.com> * Enhance POC notebook and docs (#3031) * 2.5 vdr enhancements * add table --------- Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <yuantingh@nvidia.com> * Support multiple host names for FLARE server (#3018) * support multiple host names for fl server * add connect_to check * fix server side overseer agent * add server identity to fed_client.json * fix format --------- Co-authored-by: Chester Chen <512707+chesterxgchen@users.noreply.github.com> Co-authored-by: Isaac Yang <isaacy@nvidia.com> * multi line table (#3034) --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: Holger Roth <6304754+holgerroth@users.noreply.github.com> Co-authored-by: Chester Chen <512707+chesterxgchen@users.noreply.github.com> Co-authored-by: Zhihong Zhang <100308595+nvidianz@users.noreply.github.com> Co-authored-by: nvkevlu <55759229+nvkevlu@users.noreply.github.com> Co-authored-by: Arun Patole <apatole@nvidia.com> Co-authored-by: tonywjs <130029822+tonywjs@users.noreply.github.com> Co-authored-by: Yuan-Ting Hsieh (謝沅廷) <yuantingh@nvidia.com> Co-authored-by: Zhijin <zhijinl@nvidia.com> Co-authored-by: Sean Yang <seany314@gmail.com> Co-authored-by: Isaac Yang <isaacy@nvidia.com> Co-authored-by: Yuhong Wen <yuhongw@nvidia.com> Co-authored-by: Hao-Wei Pang <45482070+hwpang@users.noreply.github.com> Co-authored-by: Holger Roth <hroth@nvidia.com> Co-authored-by: falibabaei <66964597+falibabaei@users.noreply.github.com> Co-authored-by: falibabaei <falibabaee@gmail.com> Co-authored-by: LeoDuda <50424904+LeoDuda@users.noreply.github.com> Co-authored-by: uvecw@student.kit.edu <uvecw@hkn1990.localdomain> Co-authored-by: LeoDuda <leo.dud97@gmail.com> Co-authored-by: khadijeh.alibabaei 
<khadijeh.alibabaei@kit.edu> Co-authored-by: Ziyue Xu <ziyuex@nvidia.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Naev <63109284+NAEV95@users.noreply.github.com> Co-authored-by: Alessandro Giusa <148333702+agiusa@users.noreply.github.com>

yanchengnv added 2 commits October 9, 2024 12:46

support multiple host names for fl server

759eade

add connect_to check

9f927cb

yanchengnv requested review from IsaacYangSLA and nvidianz October 9, 2024 19:43

Merge branch 'main' into support_multi_host_names

bdcbc7a

nvidianz previously approved these changes Oct 10, 2024

yanchengnv added 2 commits October 10, 2024 17:13

fix server side overseer agent

9b28a4e

Merge branch 'support_multi_host_names' of github.com:yanchengnv/NVFlare into support_multi_host_names

ec5a9c5

yanchengnv dismissed nvidianz’s stale review via ec5a9c5 October 10, 2024 21:14

yanchengnv and others added 3 commits October 11, 2024 11:42

add server identity to fed_client.json

e5d1625

fix format

cffa771

Merge branch 'main' into support_multi_host_names

c368791

IsaacYangSLA previously approved these changes Oct 11, 2024

IsaacYangSLA enabled auto-merge (squash) October 11, 2024 15:53

yanchengnv added 2 commits October 11, 2024 15:50

fix poc

30620e4

Merge branch 'support_multi_host_names' of github.com:yanchengnv/NVFlare into support_multi_host_names

c6398fe

yanchengnv dismissed IsaacYangSLA’s stale review via c6398fe October 11, 2024 19:51

Merge branch 'main' into support_multi_host_names

dc2c2c3

IsaacYangSLA approved these changes Oct 11, 2024

IsaacYangSLA merged commit 552bb36 into NVIDIA:main Oct 11, 2024
20 checks passed
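
For readers skimming the timeline, the rules from the description (each "host_names" entry may be a host name or an IP address of at most 253 chars, and "connect_to" must be the server's "name" or one of its "host_names") can be sketched roughly as below. This is only an illustration, not the code merged in this PR: `split_san_entries`, `resolve_connect_to`, and the constant name are hypothetical, and falling back to the server "name" when "connect_to" is not specified is an assumption.

```python
import ipaddress
from typing import List, Optional, Tuple

# Limit quoted in the PR description for each "host_names" value.
MAX_HOST_NAME_LENGTH = 253


def split_san_entries(host_names: List[str]) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: split host_names into (dns_names, ip_addresses).

    A cert builder could turn the two lists into DNS and IP entries of the
    SubjectAlternativeNames extension.
    """
    dns_names: List[str] = []
    ip_addresses: List[str] = []
    for host in host_names:
        if not host or len(host) > MAX_HOST_NAME_LENGTH:
            raise ValueError(f"invalid host name: {host!r}")
        try:
            ipaddress.ip_address(host)  # raises ValueError if not an IP
        except ValueError:
            dns_names.append(host)
        else:
            ip_addresses.append(host)
    return dns_names, ip_addresses


def resolve_connect_to(
    server_name: str, host_names: List[str], connect_to: Optional[str] = None
) -> str:
    """Hypothetical helper: pick the host a client or admin connects to."""
    if connect_to is None:
        # Assumption: use the server "name" when connect_to is not specified.
        return server_name
    if connect_to == server_name or connect_to in host_names:
        return connect_to
    raise ValueError(
        f"connect_to {connect_to!r} must be the server name or one of host_names"
    )
```

With a server named `server1` and `host_names` of `["fl.internal.example.com", "10.0.0.5"]` (made-up values), `resolve_connect_to("server1", host_names, "10.0.0.5")` returns the IP address, while any value outside that set raises `ValueError`.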
