Support (decentralized) Normalization for Tabular Datasets #32

Open
davidroschewitz opened this issue Mar 4, 2021 · 0 comments
Labels
feature New feature or request

For tabular datasets (popular examples: Adult Income and Titanic), normalizing the input features is critical for neural network approaches.

The most common, and a very effective, way to normalize is to standardize each feature: subtract its mean and divide by its standard deviation. However, computing these statistics in a decentralized fashion is non-trivial, because no single participant sees the full dataset. For DeAI to support this, additional functionality needs to be implemented.
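
To make the difficulty concrete, here is one way to write the standardization and the pooled statistics it needs. The symbols are assumptions introduced for illustration and do not come from DeAI: $K$ participants, where participant $k$ holds $n_k$ samples with local per-feature mean $\mu_k$ and local (population) variance $\sigma_k^2$.

$$
z = \frac{x - \mu}{\sigma}, \qquad
\mu = \frac{\sum_{k=1}^{K} n_k \mu_k}{\sum_{k=1}^{K} n_k}, \qquad
\sigma^2 = \frac{\sum_{k=1}^{K} n_k \left( \sigma_k^2 + (\mu_k - \mu)^2 \right)}{\sum_{k=1}^{K} n_k}
$$

The global mean is a plain weighted average of the local means, but the global variance is not a weighted average of the local variances alone: the spread of the local means around the global mean has to be added in.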

Examples of how this can be addressed:

  • Provide means and standard deviations for all features based on a priori knowledge. Each participant is then asked to normalize their data with these fixed values before uploading.
  • Learn the means and standard deviations in a pre-learning task and then apply them automatically to each local dataset. This could be a full DeAI training cycle, or a simple weighted average of local statistics that the participants communicate among themselves.
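
The second option, in its "simple weighted average" form, could look roughly like the sketch below. It is only an illustration under stated assumptions, not DeAI code: the function names `local_stats`, `combine_stats` and `normalize` are hypothetical, each participant is assumed to be willing to share its sample count and per-feature mean and variance, and how those summaries are actually exchanged between peers is left out.

```python
import numpy as np

def local_stats(x):
    """Summary one participant computes on its own data (hypothetical helper).

    Returns the sample count, per-feature mean and per-feature
    population variance of the local dataset.
    """
    x = np.asarray(x, dtype=float)
    return x.shape[0], x.mean(axis=0), x.var(axis=0)

def combine_stats(stats):
    """Pool (count, mean, variance) summaries from all participants.

    The global mean is the count-weighted average of the local means.
    The global variance adds the spread of the local means around the
    global mean to the count-weighted average of the local variances.
    """
    counts = np.array([n for n, _, _ in stats], dtype=float)
    means = np.stack([m for _, m, _ in stats])
    variances = np.stack([v for _, _, v in stats])
    weights = (counts / counts.sum())[:, None]
    global_mean = (weights * means).sum(axis=0)
    global_var = (weights * (variances + (means - global_mean) ** 2)).sum(axis=0)
    return global_mean, np.sqrt(global_var)

def normalize(x, mean, std, eps=1e-8):
    """Standardize local data with the agreed statistics.

    eps guards against division by zero for constant features.
    """
    return (np.asarray(x, dtype=float) - mean) / (std + eps)

# Toy example: three participants holding unequal shares of the data.
rng = np.random.default_rng(0)
data = rng.normal(loc=[1.0, -3.0, 10.0], scale=[1.0, 5.0, 0.1], size=(100, 3))
parts = [data[:10], data[10:60], data[60:]]

mean, std = combine_stats([local_stats(p) for p in parts])
normalized_parts = [normalize(p, mean, std) for p in parts]
```

The same `normalize` function also covers the first option: the a priori means and standard deviations are simply passed in instead of the pooled ones. Note that sharing counts and moments reveals some information about each local dataset, so whether this is acceptable depends on the privacy requirements of the task.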
@martinjaggi martinjaggi added the feature New feature or request label Mar 5, 2021